Preface


The  purpose  of  this  research  is  to  develop  a 
validation/verification  methodology  for  dependent  work 
breakdown  structure  (WBS)  cost  element  cost  risk  models.  Two 
general  failure  modes  exist  for  dependent  cost  risk 
methodologies.  The  first  failure  mode  is  when  the  model  fails 
due  to  improperly  specified  input  parameters.  The  second 
failure  mode  is  when  the  model  fails  because  the  methodology 
does not properly act on validly specified user inputs.

A  specific  investigation  into  the  Air  Force  Risk  Model  was 
accomplished.  A  Comparison  Model  was  developed  to  determine 
if  and  where  the  model  failed.  If  the  model  fails  then  a 
determination  of  whether  it  failed  because  of  the  methodology 
or the implementation must be made. The cost risk methodology's
effect on twenty-five pairs of triangular distributions is
evaluated.

In  doing  my  research,  I  am  greatly  indebted  to  my  thesis 
committee,  Capt  W.  P.  Simpson  (Ph.D.),  Dr.  R.  Murphy,  and  Dr. 
R.  Fenno.  I  am  indebted  to  Mr.  J.  P.  (Pete)  Barnum  at  Los 
Angeles  AFB  who  suggested  the  thesis  topic.  I  also  thank  Capt 
Fenimore for his WordPerfect help. Finally, I wish to thank
my wife, Cindy, and newborn son, Tommy, for their support,
patience and understanding.


Thomas  R.  O'Hara 

Abstract

This study develops a dependent component cost risk model
validation  methodology  and  applies  it  to  the  Air  Force  Risk 
Model.  The  validation  process  consists  of  ensuring  that 
logically  consistent  input  parameters  are  acted  on  properly  by 
the cost risk methodology. Users of all dependent component
risk  models  must  be  concerned  with  logically  consistent  input 
parameters.  Two  criteria  define  logical  consistency.  The 
first  is  the  correlation  matrix  consistency  and  the  second  is 
the  consistency  between  pairs  of  cost  distributions.  Three 
validation  criteria  are  defined  and  used  to  validate  a  cost 
risk  model.  The  first  criterion  is  that  the  process  must 
maintain  the  user  defined  correlations.  The  second  criterion 
is  that  the  total  cost  distribution  mean  and  variance  be 
congruous  with  the  analytical  value.  The  third  criterion  is 
that  properly  specified  input  parameters  not  be  altered  by  the 
cost  risk  process.  A  Comparison  Model  was  developed  in 
Quattro Pro to validate the general Air Force Risk Model
methodology. Twenty-five pairs of work breakdown structure
cost elements are defined and tested in the Comparison Model.
The final research product is a table illustrating the narrow
conditions where the Air Force Risk Model is valid.


AN INVESTIGATION OF THE AIR FORCE RISK MODEL


1.  Introduction 


Background 

All  Department  of  Defense  services  are  concerned  with 
weapon  systems  cost.  Decision  makers  want  to  receive 
accurate cost estimates, and the cost analyst's goal is to
provide  accurate  information  to  the  decision  makers. 
Furthermore,  DoDI  5000.2  (Defense  Acquisition  Management 
Policies  and  Procedures)  requires  that  Cost  Analysis 
Improvement  Group  (CAIG)  briefings  characterize  the  cost 
risk  associated  with  cost  estimates  (7:13-C-3).  Therefore, 
cost  analysts  require  a  tool  to  evaluate  the  inherent  risk 
or  uncertainty  in  any  weapon  system  acquisition  program. 

One argument for using a cost risk model based on
statistical analysis is that it provides the user with a
quantitative justification for resources added to (subtracted
from) a point estimate as opposed to a simple factor applied
to all estimates. This research develops a cost risk
validation/verification methodology and applies it to a
probabilistic/statistical cost risk model.

Total  cost  estimates  are  the  summation  of  the  lower 
level  work  breakdown  structure  cost  elements.  A  Work 
Breakdown  Structure  (WBS)  is  defined  by  Military  Standard 
881A (MIL-STD-881A) as

a  product-oriented  family  tree  composed  of  hardware, 
services  and  data  which  result  from  project 
engineering  efforts  during  the  development  and 
production  of  a  defense  materiel  item,  and  which 
completely  defines  the  project/program.  A  WBS 
displays and defines the product(s) to be developed
or  produced  and  relates  the  elements  of  work  to  be 
accomplished  to  each  other  and  to  the  end  product. 
(17:2) 

The WBS is broken down into levels. Cost estimates are
usually developed at level 3 or lower. The following
definitions of WBS levels are from MIL-STD-881A:

Level  1  is  the  entire  defense  materiel  item:  for 
example,  the  Minuteman  ICBM  System,  the  LHA  Ship 
System,  or  the  M-109A1  Self-Propelled  Howitzer 
System.  Level  1  is  usually  directly  identified  in 
the  DoD  programming/budget  system  either  as  an 
integral  program  element  or  as  a  project  within  an 
aggregated  program  element. 

Level 2 elements are major elements of the defense
materiel item: for example, a ship, an air vehicle,
a tracked vehicle, or aggregations of services
(e.g., systems test and evaluation); and data.

Level 3 elements are elements subordinate to level 2
major elements: for example, an electric plant, an
airframe, the power package/drive train, or type of
service (e.g., development test and evaluation); or
item of data (e.g., technical publications). (17:2-3)

An  example  from  MIL-STD-881A  of  the  WBS  levels  is  shown 
in  Figure  1.  The  air  vehicle,  training,  and  peculiar 
support  equipment  are  the  first  three  Level  2  breakouts. 
These  are  further  subdivided  into  their  respective  Level  3 
breakouts as shown. A similar work breakdown structure is
available in MIL-STD-881A for other Air Force weapon system
types as well as Army and Navy weapon systems. This
breakout provides logical order to cost estimating and also

Level 1           Level 2                      Level 3

Aircraft system   Air vehicle                  Airframe
                                               Propulsion unit
                                               Other propulsion
                                               Communications
                                               Navigation/guidance
                                               Fire control
                                               Penetration aids
                                               Reconnaissance equipment
                                               Automatic flight control
                                               Central integrated checkout
                                               Antisubmarine warfare
                                               Auxiliary electronics equipment
                                               Armament
                                               Weapons delivery equipment
                                               Auxiliary armament/weapons
                                                 delivery equipment

                  Training                     Equipment
                                               Services
                                               Facilities

                  Peculiar support equipment   Organizational/intermediate
                                                 (Including equipment common
                                                 to depot)
                                               Depot (Only)


Figure 1  MIL-STD-881A Work Breakdown Structure (First 3
Level 2 breakouts) (16:17-18)


performs  the  function  of  maintaining  some  consistency  in 
cost  estimating  structure  between  various  programs.  Cost 
estimates  may  actually  be  developed  at  lower  than  the  3rd 
level,  which  provides  additional  detail  into  the  cost 
estimate. 

The  total  cost  point  estimate  is  the  summation  of  all 
lower  level  point  estimates.  The  point  estimate  is  usually 
interpreted as the mean for each cost element. Typically
the cost estimating process is the summation of cost element
means to generate the total cost point estimate.

The Air Force Systems Command Cost Estimating Handbook
defines cost risk as follows:

Risk  and  uncertainty  refer  to  the  fact  that,  because 
a  cost  estimate  is  a  prediction  of  the  future,  there 
is  a  chance  that  estimated  cost  may  differ  from 
actual  cost.  Moreover,  the  lack  of  knowledge  about 
the  future  is  only  one  possible  reason  for  such  a 
difference.  Another  equally  important  cause  is  the 
error  resulting  from  historical  data 
inconsistencies,  cost  estimating  equations,  and 
factors that are typically used in an estimate.
(2:13-1 to 13-2)

Cost risk analysis is the quantification of estimating
methodology uncertainty in the total cost distribution.
There is some uncertainty with any estimate. According to
Jago, the analyst has many tools available to generate
component cost estimates, and cost risk analysis is a tool
available to account for some of this uncertainty (12:4).
Since cost risk analysis is another prediction, it only
quantifies the confidence in the estimate.

Cost  risk  analysis  is  applied  to  cost  estimates  through 
the  WBS.  There  is  a  distribution  of  cost  for  each  WBS  cost 
element.  Each  cost  element  has  an  associated  probability 
density  function  (p.d.f.).  The  probability  density  function 
represents  the  distribution  of  probability  for  an  event 
occurrence  (20:137).  The  point  estimate  or  mean  cost  for  a 
WBS  cost  element  will  vary  as  a  function  of  the  methodology 
used  in  generating  that  cost  estimate. 
According to Murphy, typically the cost analyst will use
the  mean  cost  estimate  based  on  the  method  that  is  most 
applicable  to  the  subsystem  and  the  weapons  program.  In 
applying  the  cost  risk  methodology,  the  WBS  element  mean 
cost  is  interpreted  as  the  most  likely  cost  estimate.  The 
lowest  likely  and  highest  likely  cost  are  determined  by  the 
prediction  interval  around  the  mean  cost.  The  prediction 
interval  level  is  left  for  the  user  to  decide.  That  is, 
should  the  prediction  interval  capture  80%,  90%  or  99%  of 
the  cost  estimate  with  that  particular  cost  estimating 
methodology  (18)?  Neter,  Wasserman  and  Kutner  define  the 
prediction  interval  as  the  area  under  the  prediction 
probability density function for a given mean. For example,
a cost estimate ± 3σ would be a 99.73% prediction interval
around the mean for a normal distribution, N(μ, σ²). The
highest (lowest) likely cost estimate would be at the + 3σ
(− 3σ) point (19:80-81).
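
As a quick check on the coverage of a ± 3σ interval, the following sketch computes the normal-distribution probabilities involved. It assumes the cost estimate is normally distributed, as in the example above, and uses Python's standard `statistics.NormalDist`; the function name `coverage_within` is illustrative only.

```python
from statistics import NormalDist


def coverage_within(k_sigma):
    """Probability that a normal variate falls within +/- k_sigma of its mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k_sigma) - std_normal.cdf(-k_sigma)


# Central (two-sided) coverage of the +/- 3 sigma interval.
central = coverage_within(3.0)

# One-sided probability of not exceeding the + 3 sigma point.
not_exceeding_high = NormalDist().cdf(3.0)

print(f"within +/- 3 sigma: {central:.4f}")
print(f"below  +   3 sigma: {not_exceeding_high:.4f}")
```

Under the normality assumption, the central interval covers about 99.73% of outcomes, while about 99.87% is the probability of not exceeding the + 3σ point.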

Cost  estimating  risk  analysis  is  the  function  that  cost 
analysts  perform  before  they  present  the  point  estimate  to 
decision  authorities.  The  total  cost  point  estimate  from 
cost  risk  analysis  represents  the  median  cost  for  a  weapon 
system.  The  cost  risk  process  uses  the  mean  cost  of  lower 
level  elements  to  determine  the  median  total  cost. 

Typically  analysts  will  report  two  costs  along  a  cumulative 
probability  distribution  function  (c.d.f.)  at  the  fifty  and 
seventy  percent  probability  levels.  The  cumulative 
probability  distribution  function  expresses  the  probability 
that  a  cost  does  not  exceed  a  specified  value  (20:185). 
From the AFSC Cost Estimating Handbook, the fifty percent
confidence  level  from  the  cost  risk  process  represents  the 
median  value  of  the  total  cost  distribution,  which  means 
there  is  a  fifty  percent  probability  that  actual  cost  will 
exceed  the  estimated  cost  (2:13-13).  Similarly,  the  seventy 
percent  confidence  means  that  there  is  a  thirty  percent 
probability  of  exceeding  the  cost  estimate.  Cost  risk 
analysis  techniques  assume  that  the  program  remains  constant 
as  it  quantifies  uncertainty  in  the  cost  estimating 
methodology  (2:A-16,  13-1  to  13-2).  It  does  not  account  for 
Congressional  actions,  strikes,  or  natural  phenomena  that 
occur  unexpectedly. 
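
Reading the fifty and seventy percent costs off a simulated total cost distribution can be sketched as below. This is an illustration only: the function name `cost_at_confidence`, the integer-percent argument, and the stand-in sample values are assumptions, and the empirical c.d.f. is used in place of any particular model's output.

```python
def cost_at_confidence(total_cost_samples, confidence_percent):
    """Smallest sampled cost whose empirical c.d.f. value reaches the level.

    confidence_percent is an integer such as 50 or 70.
    """
    ordered = sorted(total_cost_samples)
    n = len(ordered)
    # Ceiling of n * percent / 100 using integer arithmetic (1-based rank).
    rank = -(-n * confidence_percent // 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical total cost draws standing in for simulation output.
samples = [float(c) for c in range(1, 101)]

median_cost = cost_at_confidence(samples, 50)           # 50% of draws are at or below this
seventy_percent_cost = cost_at_confidence(samples, 70)  # 30% of draws exceed this
```

The seventy percent cost is never smaller than the median, which is why reporting it gives the decision maker a more conservative figure.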

Cost  risk  for  the  total  system  is  defined  by  using  the 
p.d.f./c.d.f. for total system cost. The method of
generating the total system cost p.d.f./c.d.f. depends on
the  assumptions  of  cost  dependency  and  the  shape  of  the  WBS 
cost  element  distributions.  The  amount  of  estimate 
confidence  is  indicated  by  the  total  cost  distribution 
cumulative  distribution  function. 

Cost  risk  analysis  methodologies  (refer  to  Figure  2) 
rely  on  the  definition  of  cost  distributions  for  each  cost 
element.  The  analyst  needs  to  define  the  mean,  lowest 
likely  cost,  highest  likely  cost,  variance,  distribution 
shape and pairwise correlations (correlation coefficient, ρ)
(12:1-12).

The  mean  and  variance  of  total  cost  can  be  determined 
analytically (19:5-6). Cost risk methodologies must either

Figure 2 The cost risk analysis process


make  assumptions  about  the  shape  of  the  total  cost 
distribution  or  determine  the  shape  by  simulation  methods. 
Therefore  the  cost  risk  analysis  process  is  the  summation  of 
all  lower  level  cost  distributions.  The  summation  of 
independent  probability  density  functions  to  determine  the 
total  probability  density  function  shape  can  be  accomplished 
with  convolution  (21:317). 

Convolution of probability density functions may be
calculated by at least two methods: analytically and by
simulation. The reader interested in analytical methods
should reference any general statistics/probability text
such as Parzen's Modern Probability Theory and Its
Applications (21:317). The simulation convolution method
draws one sample from each cost element distribution and
sums them to form one sample of total cost; repeating this
for many samples (100 to 1000) builds up the total cost
distribution (2:13-29 to 13-32).
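
The simulation convolution method for independent cost elements can be sketched as follows. The triangular inputs, the `(low, mode, high)` tuple layout, the example values, and the function name `simulate_total_cost` are assumptions made for illustration; only the draw-and-sum idea comes from the description above.

```python
import random


def simulate_total_cost(elements, n_samples=1000, seed=0):
    """Sum one independent triangular draw per cost element, n_samples times.

    elements is a list of (low, mode, high) tuples, one per WBS cost element.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_samples):
        # random.triangular takes its arguments in the order (low, high, mode).
        total = sum(rng.triangular(low, high, mode) for (low, mode, high) in elements)
        totals.append(total)
    return totals


# Two hypothetical cost elements.
elements = [(10.0, 20.0, 40.0), (5.0, 10.0, 15.0)]
totals = simulate_total_cost(elements)
sample_mean = sum(totals) / len(totals)
```

Sorting `totals` gives an empirical total cost c.d.f.; the draws here are independent, so no cost dependency is represented.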

The most commonly used cost risk methodology within the
Air  Force  is  the  Air  Force  Systems  Command  (AFSC)  Risk 
Model.  The  AFSC  Risk  model  assumes  that  all  cost  elements 
within  the  WBS  are  statistically  independent.  By  assuming 
that  each  subsystem  element  is  independent,  the  model 
misestimates the total program cost if the cost elements are
dependent.  Garvey  states  that  if  all  weapon  subsystem 
correlations  are  positive,  then  the  total  cost  is 
underestimated  (10:5). 

Unfortunately,  WBS  cost  element  dependencies  exist.  From 
Murphy,  weapon  system  component  costs  are  driven  by  the 
physical  and  performance  parameters  that  describe  the 
system.  The  physical  and  performance  parameters  are  driven 
by  the  threat  for  which  the  weapon  system  is  designed. 
Therefore,  the  overall  system  characteristics  are  relatively 
constant  to  the  threat.  However,  intrasystem  trades  do 
exist  while  maintaining  the  same  overall  goal.  The  physical 
characteristics  for  each  component  will  have  a  specific 
interrelationship  for  any  given  weapon  system.  These 
interrelationships  drive  the  cost  correlations.  Therefore, 
the  cost  correlations  are  not  spurious  statistical 
relationships  (18). 

Weapon  subsystem  cost  dependencies  can  be  further 
understood  with  two  simple  examples.  When  estimating  an 
aircraft  the  WBS  may  include  the  level  3  elements  airframe 
and  propulsion  unit.  If  the  weight  of  the  airframe  is 
increased  (thus  increasing  the  cost)  then  the  propulsion 
unit must be increased in some way to handle this increased
weight.  In  very  general  terms,  the  propulsion  system  power 
is  increased  to  meet  the  increased  demand  of  weight.  Again 
in  very  general  terms,  both  actions  would  most  likely 
increase  the  cost  of  their  respective  subsystems.  This 
would  indicate  a  positive  cost  correlation  between  these 
cost  elements. 

Cost  correlation  relationships  are  not  always  positive. 
Some subsystems' costs decrease as another subsystem
increases  in  cost.  Consider  a  target  seeking  missile  such 
as  a  kinetic  kill  vehicle  (Strategic  Defense  Initiative)  or 
an  air-to-air  missile.  The  WBS  for  this  weapon  system  would 
include  some  type  of  sensor  (active  or  passive)  and  a 
propulsion  system.  If  the  sensor  acquires  the  target  at  a 
greater  range,  the  propulsion  system  does  not  have  to 
produce  as  much  energy  as  a  sensor  that  detects  a  target  at 
a  shorter  range.  The  sensor  that  detects  the  target  at 
greater  range  is  more  expensive  than  the  sensor  that  detects 
its  target  at  short  range.  Also  the  propulsion  system  that 
produces  greater  energy  is  more  expensive  than  one  with  less 
energy.  There  is  a  negative  cost  correlation  exhibited  by 
this  example.  As  one  subsystem  increases  in  cost  the  other 
subsystem  decreases  in  cost. 

Devaney  and  Popovich  showed  in  their  research  that  the 
cost  dependency  between  weapon  system  components  should  be 
an  important  consideration  in  cost  risk  models.  Cost  risk 
analysis techniques have traditionally assumed that the WBS
elements  are  statistically  independent  (8:77). 

There  are  several  methods  available  to  evaluate  risk. 
Garvey  and  Abramson  &  Young  developed  analytical  cost  risk 
models. Garvey's model is called the Analytic Cost
Probability (ACOP) model and Abramson and Young's model is
called the Formal Risk Evaluation Methodology (FRISKEM)
(1:1; 10:1). Both works will be reviewed in Chapter II.

This  thesis  will  concentrate  on  the  Air  Force  Risk  Model 
(referred  to  as  the  Tecolote  Risk  Model  in  this  study)  which 
is  a  new  model  under  development  by  Tecolote  Research  Inc. 
and contracted by the US Air Force Cost Center (AFCC). The
Air  Force  Risk  Model  is  designed  to  estimate  cost  risk  in 
the  presence  of  cost  dependencies  or  correlations  between 
WBS  weapon  subsystems  (12:9-10). 


Verification  and  Validation 

Verification  and  validation  are  defined  by  Law  and 
Kelton  as: 

Verification  is  determining  whether  a  simulation 
model  performs  as  intended,  i.e.,  debugging  the 
computer program... Validation is determining whether
a  simulation  model  (as  opposed  to  the  computer 
program)  is  an  accurate  representation  of  the  real- 
world  system  under  study.  (14:333-334) 

Banks  and  Carson  define  verification  and  validation  as: 

Verification  pertains  to  the  computer  program 
prepared  for  the  simulation  model.  Is  the  computer 
program  performing  properly?  Validation  is  the 
determination  that  a  model  is  an  accurate 
representation  of  the  real  system.  (3:14) 

This research will determine under what conditions a
risk  model  methodology  is  valid.  Furthermore,  by  comparing 
the  output  of  any  model  with  another  user  defined  model  it 
will  verify  the  methodology's  implementation. 

Validation  Criteria 

The validation process is exhibited in Figure 3. There
are two types of risk model failure modes. The first
failure mode (further divided into failure modes 1a and 1b)
occurs when inputs are not properly specified. This is the
user's burden; that is, the user is responsible for
specifying proper inputs. The second failure mode (failure
mode 2) occurs when the methodology does not properly act on
correctly specified user inputs. This is the failure due to
the model's methodology. The first failure mode is
subdivided into two types of failures. Failure mode 1a
occurs when the correlation matrix is not internally
consistent. Failure mode 1b occurs when the cost element
distributions are not consistent with the user specified
correlation matrix. Once the input parameters fail at 1b,
the user may change either the shape of the distribution or
the pairwise correlation. The remainder of this research
assumes that the shapes are changed to match the
correlation. However, changing the correlation to match the
shape is equally valid. A set of criteria (described later
in this chapter) can be developed to validate the model in




Figure  3  Cost  risk  methodology  validation  criteria 
reference to failure mode 2. The user should be advised
when  he/she  misspecifies  parameters  or  when  the  model  fails 
to  act  properly  on  the  user's  specifications. 

Failure mode 1a is understood with a simple correlation
matrix  example.  Murphy  describes  a  three  element  WBS, 
elements  A,  B,  and  C.  Element  A  has  a  high  positive 
correlation  with  both  elements  B  and  C.  This  relationship 
forces  a  positive  correlation  between  elements  B  and  C.  The 
correlation  matrix  would  then  be  logically  consistent  (18). 

The question of consistency within the correlation
matrix is fairly easy to verify. Searle states that a
correlation matrix must be non-negative definite (either
positive semi-definite or positive definite) (23:348-349).
The test for positive definiteness and positive
semi-definiteness is covered in Chapter III.
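
For the three element example, the non-negative definite condition has a simple closed form that can be sketched as below. This is not the Chapter III test; it is a minimal illustration that relies on the fact that a symmetric 3 x 3 matrix with unit diagonal is non-negative definite exactly when every pairwise correlation lies in [-1, 1] and the determinant is non-negative. The function names and the 0.9 values are assumptions.

```python
import math


def is_consistent_3x3(r_ab, r_ac, r_bc, tol=1e-12):
    """True if the 3 x 3 correlation matrix for A, B, C is non-negative definite."""
    pairwise_ok = all(abs(r) <= 1.0 for r in (r_ab, r_ac, r_bc))
    # Determinant of [[1, r_ab, r_ac], [r_ab, 1, r_bc], [r_ac, r_bc, 1]].
    det = 1.0 + 2.0 * r_ab * r_ac * r_bc - r_ab**2 - r_ac**2 - r_bc**2
    return pairwise_ok and det >= -tol


def admissible_range_bc(r_ab, r_ac):
    """Range of B-C correlations consistent with given A-B and A-C correlations."""
    half_width = math.sqrt((1.0 - r_ab**2) * (1.0 - r_ac**2))
    centre = r_ab * r_ac
    return centre - half_width, centre + half_width


# A is highly positively correlated with both B and C.
low_bc, high_bc = admissible_range_bc(0.9, 0.9)      # roughly 0.62 to 1.0
consistent = is_consistent_3x3(0.9, 0.9, 0.9)        # positive B-C correlation
inconsistent = is_consistent_3x3(0.9, 0.9, -0.9)     # negative B-C correlation
```

With both A-B and A-C correlations at 0.9, the B-C correlation has to be at least about 0.62, which is the sense in which the relationship forces a positive correlation between B and C.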

Failure mode 1b requires highly correlated WBS cost
element  distributions  to  have  approximately  the  same  shape. 
This  is  evident  with  three  simple  examples.  First  consider 
Figure  4,  the  case  of  two  identically  distributed  cost 
element  distributions.  If  these  distributions  are 
correlated,  the  correlation  should  be  positive.  Any 
negative correlation would be inconsistent with the shapes
of the cost distributions (18). As the cost of one element
increases,  the  other  element  should  increase  or  remain 
constant  (19:5,  522).  Consider  Figure  5  with  two  WBS  cost 
element  distributions,  both  right  triangles,  one  skewed 
right  and  the  other  skewed  left.  Thus,  these  are  opposing 
Figure 4 Two identical triangular cost distributions

right  triangular  distributions.  Any  positive  correlation 
would  be  an  inconsistent  user  specification  (18).  As  the 
cost  of  one  element  increases,  the  cost  of  the  other  should 
decrease or remain constant (constant would assume that the
two  cost  elements  are  statistically  independent).  This  is 
an  assertion  from  basic  statistical  theory  relating 
correlation  and  covariance  (19:5,  522).  A  third  possibility 
is  of  symmetrical  cost  distributions  (Figure  6).  Two 
symmetrical  distributions  may  be  either  positively  or 
negatively  correlated.  If  the  two  distributions  are 
positively  correlated,  then  the  costs  change  in  the  same 
direction.  However,  an  equally  valid  possibility  is  that 
one  will  decrease  in  cost  as  the  other  increases  in  cost 
(negative  correlation).  Since  an  equal  area  under  the  cost 
element  probability  density  function  is  covered  during  the 
change,  the  correlation  consistency  remains  valid  (18).  The 
Figure 5 Two opposing right triangular cost distributions

user  should  not  misunderstand  that  cost  distributions  must 
have  some  correlation.  Cost  element  distributions  may  also 
be  independent.  Also  the  user  needs  to  be  aware  that  there 
is  a  large  "gray"  area  where  there  is  not  such  a  clear  cut 
difference  between  logical  and  illogical  correlated 
distributions.  The  three  cases  stated  above  are  simple 
examples  for  illustration  purposes.  Distributions  found  in 
the  real  world  will  be  much  more  complex,  and  the  analyst 
should  take  great  care  in  applying  any  risk  methodology  that 
considers  dependency  among  components. 

Figure 6 Two symmetrical triangular cost distributions

There are three criteria that identify failure mode 2.
A valid cost risk methodology will pass all three criteria:

2a. The user defined component correlations should be
maintained through the cost risk model (i.e., input ρ =
output ρ).

2b. The total cost mean and variance calculated by the cost
risk model should be equal to the analytical total cost mean
and variance.

2c. The input WBS cost element probability density function
shapes should be the same as the output shapes.

The  user  defined  correlations  must  be  applied  to  the 
cost  distributions  through  the  cost  risk  analysis  process. 
The  validation  of  this  is  accomplished  by  criterion  2a.  The 
Tecolote  Risk  Model  methodology  satisfies  validation 
criterion  2a  as  shown  by  Book  and  Young  in  their  paper  at 
the 24th Annual DoD Cost Symposium (4:11). The results of
their  research  will  be  shown  in  Chapter  IV. 

The  total  cost  mean  and  variance  may  be  derived 
analytically  (19:5-6).  The  cost  risk  process  should 
calculate  the  same  values  as  calculated  analytically. 
Criterion  2b  may  be  confirmed  by  simply  comparing  the 
summary statistics of the simulation output values with the
analytically  determined  values. 

The total cost distribution should behave in the manner
stated by standard statistical methods. Basic statistical
theory, as discussed by Neter, Wasserman and Kutner, shows
the statistical relationships between random variables used
in risk analysis. The cost element distributions may be
summarized by two statistics. The first is that the mean of
the sum is the sum of the means, and the second is that the
variance of the sum is equal to the sum of the variances
plus two times the sum of the pairwise covariances.
Specifically, for n = 2 the relationships are:

$$E\{Y_1 + Y_2\} = E\{Y_1\} + E\{Y_2\} \tag{3}$$

and

$$\sigma^2\{Y_1 + Y_2\} = \sigma^2\{Y_1\} + \sigma^2\{Y_2\} + 2\sigma\{Y_1, Y_2\} \tag{4}$$
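
For a WBS with n cost elements, the same standard identities take the following general form. The index notation $Y_i$ for the cost of element i is carried over from the n = 2 case and is otherwise an assumption of this note:

```latex
E\Big\{\sum_{i=1}^{n} Y_i\Big\} = \sum_{i=1}^{n} E\{Y_i\}

\sigma^2\Big\{\sum_{i=1}^{n} Y_i\Big\}
  = \sum_{i=1}^{n} \sigma^2\{Y_i\}
  + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sigma\{Y_i, Y_j\}
```

Setting n = 2 leaves a single covariance term in the double sum, which recovers the two-element relationships.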

The dependency between pairs of random variables is
indicated by the covariance, represented by
$\sigma\{Y_1, Y_2\}$, and is defined as follows:

$$\sigma\{Y_1, Y_2\} = E\{(Y_1 - E\{Y_1\})(Y_2 - E\{Y_2\})\} = E\{Y_1 Y_2\} - (E\{Y_1\})(E\{Y_2\}) \tag{5}$$


Correlation, represented by $\rho$, is the standardized
covariance between two WBS cost elements. This is
represented by the following equation:

$$\rho = \frac{\sigma\{Y_1, Y_2\}}{\sigma\{Y_1\}\,\sigma\{Y_2\}} \tag{6}$$


The  equations  shown  are  from  Neter,  Wasserman  and  Kutner 
(19:5-6,  522). 
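
The analytical values needed for criterion 2b follow directly from these relationships. The sketch below computes the total cost mean and variance from element means, standard deviations, and a correlation matrix, using the fact that each covariance equals the correlation times the two standard deviations. The function name and the numerical inputs are hypothetical.

```python
def total_cost_mean_variance(means, std_devs, corr):
    """Analytical mean and variance of the sum of correlated cost elements.

    corr[i][j] is the correlation between elements i and j (1.0 on the diagonal).
    """
    n = len(means)
    total_mean = sum(means)
    # The double sum adds each variance once (i == j) and each covariance
    # twice (i != j), matching "sum of variances plus two times the covariances".
    total_variance = sum(
        corr[i][j] * std_devs[i] * std_devs[j]
        for i in range(n)
        for j in range(n)
    )
    return total_mean, total_variance


means = [10.0, 20.0]
std_devs = [3.0, 4.0]
independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

mean_ind, var_ind = total_cost_mean_variance(means, std_devs, independent)
mean_cor, var_cor = total_cost_mean_variance(means, std_devs, correlated)
```

In this example the mean is the same in both cases, but the variance rises from 25 to 37 once the positive correlation is included, so an independence assumption would understate the spread of total cost.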

Criterion  2c  may  be  confirmed  by  comparing  the  input  and 
output  distributions  of  the  cost  risk  process  with  a 
goodness  of  fit  test.  The  analyst  using  any  risk  model 
should  expect  to  get  the  same  distribution  out  of  the  risk 
analysis  process  that  the  analyst  inputs.  According  to 
Murphy,  the  cost  distributions  that  are  used  as  inputs 
already  have  history.  That  is,  they  are  already  correlated 
to  each  other.  It  is  difficult  to  develop  cost 
distributions  independently  of  each  other  (18).  Thus  if  the 
user  inputs  a  triangular  distribution,  he/she  should  in 
return  be  generating  random  deviates  (variates)  from  a 
triangular distribution. This research will test pairs of
distributions with a range of correlations. It will show at
what correlation the cost risk methodology fails to produce
output distributions similar to the input distributions.
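
One way to compare output draws with the input triangular distribution is sketched below. The Kolmogorov-Smirnov statistic is used here only as one possible goodness of fit measure, since this chapter does not name the test; the function names, the example parameters, and the large-sample 5% critical value approximation 1.36/sqrt(n) are assumptions of the sketch.

```python
import math
import random


def triangular_cdf(x, low, mode, high):
    """Cumulative distribution function of a triangular(low, mode, high) cost."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        return (x - low) ** 2 / ((high - low) * (mode - low))
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))


def ks_statistic(samples, cdf):
    """Largest gap between the empirical c.d.f. of samples and a reference c.d.f."""
    ordered = sorted(samples)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical "output" draws; here they really are triangular, so the fit is good.
rng = random.Random(1)
low, mode, high = 10.0, 20.0, 40.0
output_draws = [rng.triangular(low, high, mode) for _ in range(2000)]

d_stat = ks_statistic(output_draws, lambda x: triangular_cdf(x, low, mode, high))
critical_5pct = 1.36 / math.sqrt(len(output_draws))  # large-sample approximation
shape_preserved = d_stat <= critical_5pct
```

A statistic well above the critical value would suggest that the risk process has altered the shape of the input distribution.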

The  above  criteria  may  be  used  to  validate  any  risk 
methodology  which  considers  dependencies  between  WBS  cost 
elements. This is exhibited in Figure 3 by showing that the
user  first  defines  the  parameters  and  then  the  user  verifies 
that  they  are  consistent  in  both  pairwise  correlations  and 
distribution  shapes. 

A  Comparison  Model  is  developed  to  accomplish  criterion 
2c.  This  model  uses  the  same  methodology  as  described  by 
Jago  (12:1-12)  and  Book  &  Young  (4:1-19).  The  Tecolote  Risk 
Model  was  not  available  for  this  research  and  the  executable 
code  did  not  offer  the  necessary  research  data. 

Verification  Criteria 

The  Tecolote  Risk  Model  will  be  verified  using 
validation  criterion  2b.  The  total  cost  mean  and  variance 
will  be  compared  to  the  analytical  values.  The  Air  Force 
Risk  Model  computer  program  was  not  available  for  this 
research; therefore, criteria 2a and 2c could not be
accomplished.

Specific  Problem 

The  Tecolote  Risk  Model  is  a  Monte  Carlo  model  that  uses 
Cholesky  decomposition  to  transform  independent  random 
deviates  (variates)  to  dependent  random  deviates  according 
to  the  user  specified  correlations.  The  focus  of  this 
research  is  to  apply  the  validation  methodology  to  the 
Tecolote  Risk  Model.  Furthermore  this  research  will 
investigate  the  validity  of  the  Cholesky  decomposition  as  a 
risk  analysis  WBS  cost  element  correlation  methodology.  The 
specific task of this research is to apply the validation
criteria  (Failure  Mode  2)  to  the  output  generated  from  valid 
user  specifications  (inputs  that  pass  Failure  Mode  1). 

Hypothesis 

The  hypothesis  will  test  the  Tecolote  Risk  Model  for 
the  three  criteria  using  logically  consistent  correlations 
and  distributions.  The  hypothesis  test  is: 

H0: The Tecolote Risk Model is a valid methodology
H1: The Tecolote Risk Model is not a valid methodology

II. Literature Review


Overview 

With  one  exception,  this  review  of  the  literature  will 
summarize cost risk analysis techniques that consider
dependency  among  cost  elements.  The  exception  is  the 
general  Monte  Carlo  Method  model.  These  techniques  may  be 
divided  into  two  general  categories:  analytical  and 
simulation  methodologies.  This  chapter  will  also  cover  the 
definitions of cost risk analysis and cost contingency
analysis.

Analytical  risk  analysis  techniques  available  now  are 
those  that  make  assumptions  about  the  shape  of  the  total 
cost  distribution  and  use  standard  statistical  formulas  to 
provide  the  cumulative  probability  on  the  c.d.f.  The 
Analytic Cost Probability (ACOP) Model assumes that the
total  cost  distribution  is  a  normally  distributed  variable 
(10:5).  The  Formal  Risk  Evaluation  Methodology  (FRISKEM) 
Model  assumes  that  the  total  cost  distribution  is  a 
lognormally  distributed  variable  (1:4). 

Simulation  methods  generally  use  the  Monte  Carlo  method 
to  derive  the  total  cost  distribution  by  sampling  the  input 
distributions  and  then  use  convolution  to  obtain  the  shape 
of  the  total  cost  distribution.  Convolution  is  a 
mathematical  method  of  summing  two  or  more  statistically 
independent  probability  density  functions.  The  Air  Force 
Systems  Command  Risk  Model  uses  this  technique  for 
independent WBS cost distributions. The Tecolote Risk Model
uses convolution in addition to Cholesky decomposition to
correlate the independent random deviates (variates) in the
Monte Carlo simulation to form the total cost distribution
(12:9-12). The Tecolote Risk Model is thus a series of
steps: generate statistically independent random deviates,
standardize the random deviates, correlate the independent
random deviates using the Cholesky decomposition,
destandardize the random deviates, and then sum the lower
level cost elements using convolution to form the total cost
distribution (12:4-12).

Devaney and Popovich, in their 1985 literature review,
showed that existing models either ignored cost dependencies
or assumed that there was total cost dependence. In either
case, the total cost is misestimated. However, by doing
cost risk analysis under both assumptions, independence and
total positive dependence, the risk analysis output will
typically provide a bound to the true estimate (8:14-29).
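The bounding argument can be illustrated with a small sketch. Assuming three hypothetical cost elements with the standard deviations shown, the total cost standard deviation under independence is the root sum of squares, while under total positive dependence (every pairwise correlation equal to 1) it is the plain sum. The function name and the numbers are illustrative assumptions and are not taken from Devaney and Popovich.

```python
import math

def total_std_bounds(sigmas):
    """Return (independent, fully dependent) std dev of a sum of costs."""
    independent = math.sqrt(sum(s * s for s in sigmas))  # all rho = 0
    fully_dependent = sum(sigmas)                        # all rho = +1
    return independent, fully_dependent

# Hypothetical element standard deviations, in units of cost.
low, high = total_std_bounds([30.0, 40.0, 120.0])
print(low, high)
```

When all true correlations lie between 0 and 1, the true total standard deviation falls between these two values. Negative correlations could push it below the independent case, which is one reason the bound is only typical rather than guaranteed.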

General Definitions. Jago defines the four elements of
uncertainty that the AF Risk Model considers. The elements
are estimating, scheduling, technology, and configuration
uncertainties and are defined as:

Estimating  uncertainty  establishes  a  band  around  an 
estimate showing the probable error in the estimate.
It is measured in units of cost. Estimates for the
elements  of  a  Work  Breakdown  Structure  are  developed 
by  a  variety  of  methods,  each  with  its  own 
characteristic  estimating  uncertainties. 

Four  basic  estimating  methods  are  now  in  common  use. 
These  are:  (1)  Cost  Estimating  Relationships 
(CERs), (2) Factors, (3) Analogies, and (4)
Engineering Build-Up. CERs and Factors can be
grouped  since  they  are  statistical  in  nature  and 
represent  expected  values  derived  from  some  data 
base.  Similarly,  Analogies  and  Engineering  Build-Up 
are  based  on  discrete  data  points. 

Schedule  uncertainty  specifies  a  band  of  time 
usually  as  durations  or  dates.  The  types  of 
information  an  analyst  has  for  the  schedule 
estimation  problem  are  things  like  when  the  program 
starts,  intermediate  milestone  dates  (such  as  PDR  or 
CDR),  and  projected  program  completion.  Program 
schedules  usually  come  in  the  form  of  Gantt  charts 
or  networks.  Schedule  uncertainty  translates  to 
cost  uncertainties  when  activities  are  on  the 
program’s  critical  path,  for  schedules  containing  a 
high  level  of  concurrence  or  parallel  paths,  and  for 
labor intensive activities (such as programming).

Technology  uncertainty  cannot  be  measured  directly 
in  either  cost  or  time,  but  rather  in  terms  of  the 
number  of  remaining  unresolved  technical  issues.  A 
good  surrogate  would  measure  its  impact  on 
successfully  achieving  critical  program  milestones 
on  schedule.  Viewed  in  this  light,  technology 
uncertainty  impacts  schedule  uncertainty  when  the 
technology  is  not  mature  when  needed,  or  when  the 
subsystem  design  incorporating  the  technology  does 
not  adequately  reflect  its  technical  performance  or 
interface  characteristics. 

Configuration  uncertainty  captures  the  changes  in 
basic cost-driving variables. Thus, if the cost-driving
variables were weight, volume, or power,
then  the  units  of  measure  might  be  in  pounds,  cubic 
feet,  or  kilowatts.  The  sources  of  configuration 
uncertainty  are  design  changes  during  development  or 
production,  or  growth  in  the  cost-driving  variables 
from  'requirements  creep'.  (12:4-5) 

To limit its extent, this research focuses primarily on
cost estimating uncertainty (referred to here as cost risk).
Any analyst must be very careful in accounting for the
remaining three uncertainties. By ignoring the schedule,
technology, and configuration uncertainties, the analyst
will not capture the true risk in an estimate. This author
recommends that further research or guidelines be developed
before the remaining three categories are implemented.

Contingency 

The AFSC Cost Estimating Handbook defines contingency as:

an  allowance  or  amount  added  to  an  estimate  to  cover 
a  possible  future  event  or  condition  arising  from 
presently  known  or  unknown  causes,  the  cost  outcome 
of which is indeterminable at a present time. (2:A-16)

and  contingency  analysis  as  follows: 

Repetition  of  an  analysis  with  different  qualitative 
assumptions  -  e.g.  how  well  will  equipment  perform 
on  different  terrain/type  of  conflict,  etc.  (2:A-16) 

Contingency allowances are different from resources
added (subtracted) due to risk analysis techniques. A
contingency budget could be used for anticipated budget
cuts, congressional cuts, and other unknown problems. Risk
budgets are strictly to compensate for known problems with
the cost estimating methodologies. Contingency analysis may
be said to be a what-if exercise to generate multiple
program options to present to a decision maker (2:A-16).
This thesis will not cover contingency analysis or the
techniques available for doing it.

Statistical and Probabilistic Relationships

WBS  element  cost  distributions  may  be  described  by 
summary  statistics  such  as  the  mean,  mode,  and  variance. 
Another way to describe WBS element cost distributions is
graphically  or  functionally.  This  second  description 
determines  the  actual  shape  of  any  probability  density 
function  (p.d.f.)  from  which  the  mode,  mean,  and  variance 
may  be  derived.  The  Normal  distribution  is  the  only 
distribution  that  is  completely  defined  by  the  mean  and 
variance.  All  other  distributions  require  additional 
moments  to  fully  describe  the  shape  (8:26).  WBS  element 
interrelationships  are  described  by  the  pairwise  correlation 
terms  between  the  elements  (19:522). 

The summation of the means and variances of lower level
cost elements will result in the total cost mean and
variance (see equations 1 and 2). However, this does not
provide the shape of the total cost distribution.
Convolution is a mathematical method that computes the shape
of the sum of independent distributions analytically
(6:85-88).
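For reference, the standard identities for a total cost $T = \sum_i X_i$ built from $n$ cost elements are sketched below. The symbols $\mu_i$, $\sigma_i$, and $\rho_{ij}$ (element mean, element standard deviation, and pairwise correlation) are notation assumed here and may differ from the notation of equations 1 and 2.

```latex
\begin{align}
E[T] &= \sum_{i=1}^{n} \mu_i \\
\operatorname{Var}[T] &= \sum_{i=1}^{n} \sigma_i^2
  + 2 \sum_{i<j} \rho_{ij}\, \sigma_i\, \sigma_j
\end{align}
```

When the elements are independent, every $\rho_{ij}$ is zero and the total variance reduces to the plain sum of the element variances. Neither identity says anything about the shape of the distribution of $T$.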


Analytical  Cost  Risk  Methodologies 

Convolution Overview. The shape of the total cost
distribution may be found by simulation or analytical
methods. Simulation methods are discussed in Chapter III.
The analytical convolution method may be used to determine
the exact distribution of a sum of independent
distributions; however, a closed-form solution may not exist
(16:68-82). Therefore, simulation methods (specifically the
Monte Carlo method) offer a practical alternative to
analytical convolution (14:50).
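To make the idea of convolution concrete, the following minimal sketch convolves two independent discrete cost distributions. It assumes costs take only whole-unit values, which is a simplification chosen for illustration; the function name and the probabilities are hypothetical.

```python
def convolve(p, q):
    """Distribution of X + Y for independent discrete X ~ p and Y ~ q.

    p[i] is P(X = i) and q[j] is P(Y = j) on integer cost units.
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj   # every (i, j) pair contributes to cost i + j
    return out

# Two hypothetical cost elements, each costing 0, 1, or 2 units.
p = [0.25, 0.50, 0.25]
q = [0.50, 0.25, 0.25]
total = convolve(p, q)
print(total)
```

The result has one entry per possible total cost (0 through 4 units) and sums to one. For continuous distributions the double loop becomes an integral, which is where a closed form may be unavailable and simulation becomes attractive.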
Analytical Cost Probability Model (ACOP). Garvey
develops  and  provides  an  example  of  a  weapon  system 
acquisition  cost  risk  model.  The  model  is  called  the 
Analytic  Cost  Probability  (ACOP)  model  and  was  developed  at 
the  MITRE  Cost  Analysis  Technical  Center  (10:1). 

Garvey  stated  that  the  Air  Force  Systems  Command's 
Electronic  Systems  Division  requires  two  properties  in  a 
cost  risk  model.  First,  the  model  must  be  a  non-simulation 
risk  model  and  second,  it  must  take  into  account  the  effect 
of  (WBS)  element  interdependencies.  Most  cost  risk  models 
are  based  on  a  Monte  Carlo  (simulation)  method  and 
furthermore assume that all WBS elements are statistically
independent (10).

Garvey showed that a closed form solution would
alleviate some of the restrictions in implementing a
simulation cost risk model, primarily the long computation
time required for typical Monte Carlo methods. This model
requires  definition  of  the  WBS  element’s  distribution  type, 
most  likely  cost,  standard  deviation  and  WBS  element 
pairwise  correlations.  The  ACOP  model  assumes  that  the 
level  2  prime  mission  equipment  is  a  normally  distributed 
variable  with  all  other  level  2  cost  elements  correlated  to 
it  (10:3-10). 

MIL-STD-881A defines prime mission equipment for
electronic systems as:

The  prime  mission  equipment  element  refers 
to  the  equipments  and  associated  computer 
programs  used  to  accomplish  the  prime 
mission of the defense materiel item. Those
support  equipments  and  services  vital  to  the 
operation  and  maintenance  of  the  system,  but 
not integral with the prime function of the
system  are  excluded.  (17:34) 

Garvey  states  that  if  the  prime  mission  equipment  cost 
dominates  the  cost  of  all  other  level  2  WBS  elements,  then 
it  can  be  assumed  that  total  system  cost  is  approximately 
normal  (10:1-11).  The  assumption  of  a  normally  distributed 
variable is the key limiting factor with the ACOP model.
The cost analyst must assume a shape for the total cost
distribution.

Garvey  provides  an  appendix  with  proofs  for  all  theorems 
used  throughout  the  model.  The  author  also  provides  an 
example  to  illustrate  the  methodology  (10:1-11). 

Garvey's  model  alleviates  the  necessity  of  using 
simulation  methods.  The  ACOP  model  and  the  Tecolote  Risk 
model  are  similar  in  that  they  allow  for  the  input  of  WBS 
element  correlation.  The  model's  ability  to  include  WBS 
correlations  should  provide  better  program  cost  estimates 
(10:1-11).  However,  the  cost  analyst  must  understand  the 
limitations  of  the  model  discussed  in  the  conclusion  of  this 
chapter.

Formal Risk Evaluation Methodology (FRISKEM). Abramson
and Young define a model which may be used to evaluate
multiple program options including risk analysis. This
model is called the Formal Risk Evaluation Methodology
(FRISKEM). They also discuss the possibility of generalizing
the model for standard risk analysis. However, different
assumptions (these assumptions are not developed in the
paper) about the element distributions must be made (1:1-9).

Abramson  and  Young  define  the  FRISKEM  as  a  model  which 
assumes  that  lower  level  WBS  cost  element  distributions  are 
triangular  cost  distributions.  The  sum  (total  cost 
distribution)  of  these  lower  level  elements  is  assumed  to  be 
a  log-normal  distribution.  This  model  has  been  developed  to 
compare  competing  program  solutions  to  the  same  general 
problem (1:8-9).

Simulation  Cost  Risk  Methodologies 

General Monte Carlo Methods. Law and Kelton define the
Monte Carlo simulation method as

a scheme employing random numbers, that is, U(0,1)
random variables, which is used for solving certain
stochastic or deterministic problems where the
passage of time plays no substantive role. Thus,
Monte Carlo simulations are generally static rather
than dynamic. (14:49)

Dienemann describes the Monte Carlo technique required
for cost uncertainty analysis. The model requires that the
user input the summary statistics of the WBS elements. He
uses the Monte Carlo method to generate samples from each
element distribution. The samples are summed by convolution
to generate the total cost distribution. These methods are
developed and an example of usage is shown. The model
assumes that all WBS elements are statistically independent
(9:1-27).

Monte Carlo Convolution. According to Jago and Book &
Young, the Tecolote Risk Model uses convolution by summing
the random deviates from each lower level WBS cost
distribution to form the total cost distribution (12:12;
4:14). According to Murphy, one random deviate from a cost
element distribution represents one sample of cost from that
distribution. Thus the total cost distribution shape is
formed by summing the lower level cost element samples once
per draw, for 1000 draws (the default value in the Tecolote
Risk Model) (18).

Correlated Monte Carlo Methods. Johnson describes the
use of the Cholesky decomposition (factor) for normally
distributed variables in Monte Carlo models. The method
described generates correlated normal variates from
independent normal variates. He states that this method
will only work when the correlation matrix is nonsingular
(that is, invertible). Johnson also indicates that the
Cholesky factor is not the unique solution: there are other
factorizations which solve AA' = Σ, where Σ is the
correlation matrix (13:52-55).
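The method Johnson describes can be sketched for the two-variable case. The code below is a minimal illustration, assuming standard normal inputs and the 2 x 2 Cholesky factor with rows (1, 0) and (rho, sqrt(1 - rho^2)); the function names, the seed, and the sample size are arbitrary choices made here.

```python
import math
import random

def correlated_normal_pair(rho, rng):
    """Return one pair of standard normals with correlation rho.

    Uses the 2 x 2 Cholesky factor L = [[1, 0], [rho, sqrt(1 - rho^2)]].
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2

def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(12345)  # arbitrary seed for repeatability
pairs = [correlated_normal_pair(0.6, rng) for _ in range(20000)]
xs, ys = zip(*pairs)
r = sample_correlation(xs, ys)
print(round(r, 2))
```

For normal inputs the transformed variates remain normal, so the marginal shapes are preserved. Whether the same holds for non-normal inputs, such as triangular distributions, is the kind of question the validation in this research addresses.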

Tecolote  Risk  Model  Overview.  The  Tecolote  Risk  Model 
has  been  designed  to  consider  four  types  of  uncertainties. 
These  are  estimating  (cost  risk),  schedule,  technology,  and 
configuration  uncertainties  (12:4-5).  This  research  will 
concentrate  on  cost  estimating  risk  exclusively. 

The  Tecolote  Risk  Model  requires  the  following  inputs 
for  each  WBS  element  or  major  subsystem:  most  likely  cost, 
highest  likely  cost,  lowest  likely  cost,  distribution  type 
(beta,  triangular,  and  uniform),  standard  deviation,  and 
subsystem pairwise correlations (12:3-12). The Tecolote
Risk Model is a Monte Carlo model that uses Cholesky
decomposition to transform independent random deviates to
dependent random deviates according to the user specified
correlations. The user defined input parameters determine
the shape of the Monte Carlo random deviates in the Tecolote
Risk Model. The Cholesky decomposition is applied to the
user defined WBS cost element correlation matrix. The
Cholesky decomposition is a numerical method that factors a
symmetric positive definite matrix into the product of a
lower triangular matrix and its transpose, an upper
triangular matrix (11:141-146). Both the Cholesky
decomposition and positive definiteness will be further
discussed in Chapter III. The Tecolote Risk Model uses
convolution to generate the total program cost distribution
(12:11-12).

The Cholesky factor (an n x n matrix) is postmultiplied
by the independent Monte Carlo random deviates. This forms
correlated Monte Carlo random deviates (12:11). The first
distribution is never changed, but all subsequent cost
element distribution shapes are changed depending on the
pairwise correlations defined by the user. If the
correlation matrix is the identity matrix (meaning that the
distributions are independent), then the post factored
distributions are identical to the pre-factored
distributions.

The  correlated  Monte  Carlo  draws  are  summed  to  form  the 
total  cost  distribution  (12:12).  This  final  distribution 
forms a p.d.f. and c.d.f. From the total cost distribution,
the  decision  maker  may  select  the  confidence  level  and  thus 
the  cost  estimate  he/she  wishes  to  report. 
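Reading a cost estimate off the simulated c.d.f. at a chosen confidence level can be sketched as follows. This is a minimal illustration that assumes the empirical c.d.f. of the sampled total costs is used directly, with no interpolation; the function name and the sample values are hypothetical.

```python
import math

def cost_at_confidence(total_costs, confidence):
    """Smallest sampled total cost whose empirical c.d.f. >= confidence."""
    ordered = sorted(total_costs)
    # Index of the first sample at or above the requested probability.
    k = max(0, math.ceil(confidence * len(ordered)) - 1)
    return ordered[k]

# Ten hypothetical total-cost draws.
samples = [105, 98, 120, 110, 101, 93, 130, 115, 108, 99]
print(cost_at_confidence(samples, 0.50))
print(cost_at_confidence(samples, 0.80))
```

A higher confidence level selects a point further into the upper tail, so the reported estimate rises with the level of confidence the decision maker asks for.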

Conclusions 

The  problem  with  using  the  analytical  risk  analysis 
techniques  (ACOP  and  FRISKEM)  discussed  above  is  that  both 
models  assume  a  specific  shape  for  the  total  cost 
distribution.  This  limits  the  applicability  of  the  model  to 
a  subset  of  all  possible  total  cost  distributions.  The 
Tecolote Risk Model does not assume a shape for the total
cost  distribution. 

The  ACOP  model  may  be  used  for  situations  where  the 
prime  mission  equipment  is  normally  distributed  and 
dominates  all  other  cost  elements  correlated  to  it.  ACOP 
will  not  be  further  discussed  in  this  research. 

The Formal Risk Evaluation Methodology (FRISKEM) is a
potential methodology for comparisons of multiple program
solutions. However, an evaluation of the assumptions of
distribution types would be required for departure from
those rigid guidelines. FRISKEM will not be further
discussed in this research.

The use of a simulation model appears to be the most
appropriate approach to cost risk analysis. The major
concern about the Tecolote Risk Model is whether it uses a
valid correlation methodology. The Cholesky decomposition
will be more fully explored and developed in Chapter III.
III. Methodology


Overview 

This  chapter  will  provide  an  overview  of  the  Tecolote 
Risk  Methodology  and  how  the  methodology  is  implemented  in 
the  Comparison  Model.  The  risk  methodology/model  validation 
and  verification  methodology  will  be  described  in  this 
chapter.

The  methodology  developed  in  this  research  is  general 
and  may  be  applied  to  all  cost  risk  models  which  consider 
cost  element  dependencies.  However,  the  methodology  is 
applied  in  this  research  specifically  to  the  Tecolote  Risk 
Model.

The  Tecolote  Risk  Model 

The  Tecolote  Risk  Model  generates  uniform  random 
deviates,  forms  these  into  user  defined  p.d.f.s  (beta, 
triangular,  and  uniform),  standardizes  (normalizes)  the 
random  deviates,  computes  the  Cholesky  lower  triangle 
factor,  postmultiplies  the  Cholesky  factor  by  the 
standardized  random  deviates,  and  then  destandardizes  them 
to  form  the  post  factored  distributions.  The  total  cost  of 
all  lower  level  cost  elements  is  calculated  by  convolution. 
In  simulation  models  convolution  is  simply  the  addition  of 
the  vectors  of  random  deviates  (14:249-250).  According  to 
the Air Force Systems Command Handbook, each random deviate
is a cost sample (also referred to as a draw) from its
respective distribution. Therefore, if the model has two
cost  elements,  then  one  sample  from  the  first  distribution 
of  cost  is  added  to  the  corresponding  sample  from  the  second 
distribution  of  cost.  The  summation  of  one  sample  from  each 
WBS  cost  element  distribution  is  a  sample  of  cost  from  the 
total  cost  distribution.  The  collection  of  these  samples 
forms  the  total  cost  distribution  (2:13-29  to  13-32).  The 
Tecolote  Risk  Model  default  number  of  random  deviates  is 
1000  with  a  maximum  number  of  random  deviates  of  9,999 
(12:45).

Correlation  Matrix 

Positive Definite Matrices. Stoer and Bulirsch define
positive definite matrices as follows:

An n x n matrix C is said to be positive definite if
it satisfies:

(a) $C = C^H$, i.e., C is a Hermitian matrix.

(b) $x^H C x > 0$ for all $x \in \mathbb{C}^n$, $x \neq 0$. (24:172-173)

A Hermitian matrix C is positive definite (positive
semidefinite) if and only if all eigenvalues of C
are positive (nonnegative). (24:330)

Searle states that the correlation matrix is non-negative
definite (either positive semi-definite or positive
definite). This is due to the fact that all variances in
the variance-covariance matrix are 0 or positive
(23:347-349). Searle states that "symmetric matrices are a
subset of Hermitian matrices" (23:342). Since the
correlation matrix (C) is symmetric, it is also known to be
Hermitian. According to Stoer and Bulirsch, if all the
eigenvalues are strictly positive (non-negative), then the
matrix is positive definite (positive semidefinite)
(24:330). Thus the test for valid correlation matrices is
simply a calculation of the correlation matrix eigenvalues.
If all eigenvalues are non-negative, then the correlation
matrix is valid.
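For the two-element case studied in this research, the eigenvalue test can be written in closed form. The sketch below assumes a 2 x 2 correlation matrix with ones on the diagonal, for which the eigenvalues are 1 - rho and 1 + rho; the function names and the tolerance are illustrative choices, and a larger matrix would need a numerical eigenvalue routine instead.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2.0
    half_gap = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mid - half_gap, mid + half_gap

def classify_correlation(rho, tol=1e-12):
    """Classify the 2 x 2 correlation matrix [[1, rho], [rho, 1]]."""
    smallest, _ = eigenvalues_2x2_symmetric(1.0, rho, 1.0)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "positive semidefinite"
    return "invalid"

for rho in (0.5, 1.0, 1.2):
    print(rho, classify_correlation(rho))
```

A correlation of exactly 1 or -1 yields a zero eigenvalue (positive semidefinite), while any value strictly between them yields two positive eigenvalues.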

Cholesky Decomposition. The correlation matrix, C, is
defined to be a real symmetric positive definite matrix.
Stoer and Bulirsch describe the Cholesky decomposition
(also referred to as Cholesky factorization in some texts)
as the operation that results in finding L, the lower
triangle factor matrix of a symmetric positive definite
matrix, C. Then $C = LL^T$, where L is the lower triangle
factor and $L^T$ is the upper triangle factor. Formally,
Stoer and Bulirsch define Cholesky decomposition as follows:

For each n x n positive definite matrix C
there is a unique n x n lower triangular
matrix L ($l_{ik} = 0$ for $k > i$) with $l_{ii} > 0$,
$i = 1, 2, \ldots, n$, satisfying $C = LL^H$. If C is
real, so is L. (24:174)

Specifically, if C is defined to be a real 3 x 3
correlation matrix, then the Cholesky decomposition is:

$$\begin{bmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{21} & 1 & \rho_{23} \\ \rho_{31} & \rho_{32} & 1 \end{bmatrix} = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{bmatrix}$$

Thus by linear algebra:

$$l_{11} = 1, \quad l_{21} = \rho_{21}, \quad l_{31} = \rho_{31}, \quad l_{22} = \sqrt{1 - \rho_{21}^2}, \quad l_{32} = \frac{\rho_{32} - \rho_{31}\rho_{21}}{\sqrt{1 - \rho_{21}^2}}, \quad l_{33} = \sqrt{1 - \rho_{31}^2 - l_{32}^2}$$

Thus the C matrix is factored into the lower triangle matrix
L. L is postmultiplied by the independent random deviates to
form correlated random deviates. It is the Cholesky
decomposition that enables the Tecolote Risk Model to form
correlated distributions of cost. In a two WBS element
case, the correlation is done by correlating the second
distribution to the first. Thus, the first distribution
remains constant, while the second distribution is
transformed to a correlated (to the first distribution)
distribution shape.
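The factorization can be sketched as a general routine for any real symmetric positive definite matrix. This is a minimal illustration of the standard row-by-row Cholesky algorithm, not the Tecolote implementation; the function name and the 3 x 3 correlation values are hypothetical.

```python
import math

def cholesky(c):
    """Lower triangular L with L * L^T = c, for symmetric positive definite c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = c[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)       # diagonal entry
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]  # below-diagonal entry
    return low

# Hypothetical 3 x 3 correlation matrix.
corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
L = cholesky(corr)

# Multiply L by its transpose to confirm that the input is recovered.
rebuilt = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]
```

Because the first row of L is (1, 0, 0), the first standardized element passes through unchanged, which matches the observation that the first distribution remains constant.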

Tecolote Risk Model Verification With the Comparison Model

Overview.  The  Tecolote  Risk  Model  computer  program  was 
not  available  for  this  research.  Therefore,  the  same 
procedures  as  outlined  above  for  the  Tecolote  Risk  Model 
have  been  implemented  in  the  Comparison  Model.  The 
Comparison  Model  is  then  used  to  verify  the  implementation 
of  the  Tecolote  Risk  Model. 

Comparison Model Design. The Comparison Model was
developed in Quattro Pro 3.0 using an 80386DX 20 MHz IBM AT
compatible. The computer is equipped with 2 MByte of RAM
and a 67 MByte hard disk. The model is limited to 2 WBS
cost elements and triangular cost distributions. The model
serves two useful purposes. The first is to compare the
cost distributions before and after the Cholesky
decomposition, as the Tecolote Risk Model does not allow this
visibility. Second, if the Tecolote Risk Model fails any
test,  then  the  same  test  can  be  applied  to  the  Comparison 
Model  to  determine  if  the  failure  is  with  the  methodology  or 
the  implementation. 

Following  is  a  step-by-step  procedure  of  how  the  model 
was  designed,  including  how  the  tests  and  other  statistics 
were  gathered.  Book  &  Young  and  Jago  are  the  primary 
sources  of  information  in  designing  the  Comparison  Model 
(4:1-19;  12:1-12). 
Independent Uniform Random Deviate Generation. Markland
describes the process to generate the independent uniform
random deviates from a pseudorandom number generator. The
Comparison Model uses the multiplicative congruential
method. The general form for this method is:

$$X_{i+1} = K X_i \pmod{m} \qquad (15)$$

where $K = 5^{13} = 1{,}220{,}703{,}125$ and $m = 2^{31} - 1 = 2{,}147{,}483{,}647$.
For the purposes of this model, $X_0$ was chosen to be 10,000
(random seed 1). This pseudorandom number generator
generates an independent string of random digits with a
period of 2,147,483,647 (15:609-610). All cases have been
tested with 4 other random number seeds. The seeds are
1,589,823,392 (random seed 2), 776,519,062 (random seed 3),
1,817,216,169 (random seed 4), and 641,504,206 (random seed
5). This was done to verify that the results are
independent of the random number generator. All five seeds
are statistically independent from each other. That is, no
list of random deviates contains the exact same random
deviate as any other list. The Comparison Model generates
2000 random numbers (1000 random draws for each of the two
distributions) with a single random seed such that
independence is guaranteed.

The  random  deviates  are  then  divided  by  2,147,483,647  to 
form  0-1  uniform  random  deviates.  These  are  then  used  to 
generate  the  triangular  random  deviates  needed  to  run  the 
risk model.
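The generator and the scaling step can be sketched in a few lines. The constants and the first seed are those stated in the text; the function name is an assumption, and the original was built in a spreadsheet, so this is an illustration of the recurrence rather than a reproduction of the Comparison Model.

```python
K = 5 ** 13          # multiplier, 1,220,703,125
M = 2 ** 31 - 1      # modulus, 2,147,483,647

def lcg_uniforms(seed, count):
    """Return `count` uniform deviates from X[i+1] = K * X[i] mod M."""
    x = seed
    out = []
    for _ in range(count):
        x = (K * x) % M
        out.append(x / M)   # divide by the modulus to scale into (0, 1)
    return out

# Random seed 1 from the text; 2000 draws cover both distributions.
u = lcg_uniforms(10_000, 2000)
first, second = u[:1000], u[1000:]
print(min(u), max(u))
```

Python integers do not overflow, so the product of K and X needs no special handling here. Drawing all 2000 values from one stream and splitting it is one way to read the statement that a single seed feeds both distributions.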




Law and Kelton describe a triangular distribution
generation method using 0-1 uniform random deviates. The
following equation was used for this task:

$$X = \begin{cases} \sqrt{cU}, & \text{if } U \le c \\ 1 - \sqrt{(1-c)(1-U)}, & \text{otherwise} \end{cases} \qquad (16)$$

where U represents the uniform random deviate draw and c is
the mode of the triangular distribution. X represents a
single Monte Carlo draw, which is replicated 1000 times for
each distribution. The result is a 1 x 1000 vector for each
cost element. This equation generates a 0-1 triangular
distribution (14:261). To make the distributions closer to
a real application, the 0-1 distribution is multiplied by a
scalar value of 1000. Thus the Comparison Model generates
two triangular distributions with a range of 0 to 1000.
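The inverse-transform step for the triangular distribution can be sketched as follows. Python's built-in generator stands in for the model's own uniform deviates, which is an assumption made to keep the example self-contained; the function name, the seed, and the mode of 0.75 are illustrative.

```python
import math
import random

def triangular_01(u, c):
    """Inverse-transform draw from a triangular(0, 1) density with mode c."""
    if u <= c:
        return math.sqrt(c * u)
    return 1.0 - math.sqrt((1.0 - c) * (1.0 - u))

rng = random.Random(1)   # stand-in uniform source (assumption)
mode = 0.75
# Scale each 0-1 draw by 1000 to get a distribution on 0 to 1000.
draws = [1000.0 * triangular_01(rng.random(), mode) for _ in range(1000)]
print(min(draws), max(draws))
```

A quick check on such a sample is its mean, which for a triangular distribution on 0 to 1000 with mode 750 should be close to (0 + 1000 + 750) / 3.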

The Comparison Model is designed to account for
differences of scale using a standardization (normalization)
technique. Book and Young describe the standardized
Z-scores by the following equation:

$$z_{jk} = \frac{x_{jk} - \mu_j}{\sigma_j} \qquad (17)$$

where j represents the WBS cost element, k is the Monte
Carlo random deviate, and $\mu_j$ and $\sigma_j$ are the mean
and standard deviation of cost element j. The Z score is a
standardized random deviate generated by the Monte Carlo
method random number generator (4:7-8).

A  note  on  the  practical  application  of  the 
standardization  process  is  that  it  will  maintain 
distributions in their proper proportion. That is, if
distribution one has a range of 0 to 1000 with a mode of 750
and  distribution  two  has  a  range  of  0  to  100  with  a  mode  of 
75,  then  these  two  distributions  could  be  highly  correlated 
even  though  they  have  different  ranges. 

After  the  random  deviates  are  generated  and 
standardized,  the  next  step  is  the  Cholesky  decomposition. 

Cholesky Decomposition. The correlation matrix must be
shown to be positive definite before continuing with the
factorization. This research is limited to 2 WBS elements.
All potential 2 x 2 correlation matrices are positive
definite except when the correlation equals exactly -1 or 1;
then the correlation matrix is positive semi-definite. A
positive semi-definite matrix is a valid correlation matrix;
however, it does not have a corresponding Cholesky factor.
Therefore, this research must limit itself to positive
definite matrices. The Cholesky decomposition correlation
process only affects the second of these two elements. In
this research the second element is referred to as the
non-pivot element.

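The 2 x 2 correlation matrix, C, studied in the
Comparison Model is:

$$C = \begin{bmatrix} 1 & \rho_{12} \\ \rho_{21} & 1 \end{bmatrix} = \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix} \begin{bmatrix} l_{11} & l_{21} \\ 0 & l_{22} \end{bmatrix} \qquad (18)$$

Thus the matrix C is factored into the lower triangle matrix
L, which is multiplied by the standardized Z score vector to
form the correlated Z* score vector:

$$\begin{bmatrix} z^*_{1k} \\ z^*_{2k} \end{bmatrix} = \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix} \begin{bmatrix} z_{1k} \\ z_{2k} \end{bmatrix} \qquad (19)$$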


The Comparison Model then uses the reverse
standardization process to form the post factored
distributions. This is accomplished by general methods as
follows:

$$x^*_{jk} = \sigma_j z^*_{jk} + \mu_j \qquad (20)$$

The resulting x*'s are correlated random Monte Carlo draws.
The two vectors of 1000 draws each (2000 draws in total)
form the post factored distributions. Note again that
distribution 1 is exactly the same as the pre-factored
distribution 1, and distribution 2 has changed depending on
the correlation assigned between distributions 1 and 2.

The  Comparison  model  then  sums  (convolution)  the  cost 
element  distribution  vectors.  This  forms  the  total  cost 
distribution. 
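The full sequence of steps for two cost elements (standardize, apply the Cholesky factor, destandardize, sum) can be sketched end to end. Two assumptions are made here: the standardization uses the sample mean and standard deviation of each vector, and Python's `random.triangular` stands in for the model's own triangular generator. The function names, seed, modes, and correlation of 0.8 are illustrative.

```python
import math
import random

def mean_std(xs):
    """Sample mean and (population-form) standard deviation."""
    m = sum(xs) / len(xs)
    return m, math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def sample_corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, sx = mean_std(xs)
    my, sy = mean_std(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (sx * sy)

def correlate_pair(x1, x2, rho):
    """Standardize, apply z* = L z, and destandardize two cost vectors."""
    m1, s1 = mean_std(x1)
    m2, s2 = mean_std(x2)
    l21, l22 = rho, math.sqrt(1.0 - rho * rho)   # second row of L; l11 = 1
    out1, out2 = [], []
    for a, b in zip(x1, x2):
        z1, z2 = (a - m1) / s1, (b - m2) / s2         # standardize
        z1_star, z2_star = z1, l21 * z1 + l22 * z2    # correlate
        out1.append(s1 * z1_star + m1)                # destandardize
        out2.append(s2 * z2_star + m2)
    return out1, out2

rng = random.Random(7)   # arbitrary seed
x1 = [rng.triangular(0.0, 1000.0, 750.0) for _ in range(1000)]
x2 = [rng.triangular(0.0, 1000.0, 250.0) for _ in range(1000)]
y1, y2 = correlate_pair(x1, x2, 0.8)
total = [a + b for a, b in zip(y1, y2)]   # one total-cost sample per draw
r = sample_corr(y1, y2)
print(round(r, 2), min(y2), max(y2))
```

In this sketch the first vector comes back unchanged apart from rounding, and the second keeps its mean while its individual values move. Printing the minimum and maximum of the second vector is a simple way to see whether its range has shifted relative to the original 0 to 1000.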

The  Comparison  Model  uses  Quattro  Pro’s  frequency 
command  to  form  a  histogram  (p.d.f.)  of  the  cost  elements 
and  the  total  cost  distributions.  The  p.d.f.s  are  then 
summed  to  form  the  c.d.f.s  of  each  distribution. 

One characteristic noted on Quattro Pro's frequency
distribution is that if the interval reports 3 occurrences
in the 75 to 100 interval, the 3 occurrences actually occur
in the range 76 to 100. This has not been adjusted for,
since the model is for comparison uses only. If this model
were to be used for an actual cost risk analysis, then the
median of the intervals should be used. Since this
characteristic affects both pre and post factored
distributions in the same manner, the effect is nullified
and the test remains valid.
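The binning behavior described above, and the step from a histogram to a c.d.f., can be sketched as follows. The code assumes each bin includes its upper edge and excludes its lower edge, which is one reading of the 76 to 100 observation for whole-number costs; it is not a reproduction of Quattro Pro's frequency command, and the sample values are hypothetical.

```python
def frequency(values, upper_edges):
    """Count values in right-closed bins (previous edge, edge]."""
    counts = [0] * len(upper_edges)
    for v in values:
        for i, edge in enumerate(upper_edges):
            if v <= edge:
                counts[i] += 1
                break            # values above the last edge are not counted
    return counts

def cumulative(counts):
    """Running share of the total count, i.e. an empirical c.d.f."""
    total, running, out = sum(counts), 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out

costs = [10, 25, 26, 50, 75, 76, 99, 100]   # hypothetical cost samples
edges = [25, 50, 75, 100]
pdf_counts = frequency(costs, edges)
cdf = cumulative(pdf_counts)
print(pdf_counts, cdf)
```

Under this convention a cost of exactly 75 is counted in the 50 to 75 bin, so the bin labeled 75 to 100 holds only values from 76 upward.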

Other  statistics  that  are  recorded  for  each  distribution 
pair  are  as  follows:  Total  Cost  Distribution's  mean, 
variance,  standard  deviation,  1st  non-zero  point,  and  1st 
1000  point.  The  model  also  gathers  statistics  on  the  pre 
and  post  factored  distribution  2  as  follows:  skewness, 
mode, 1st non-zero point, and 1st 1000 point. The 1st
non-zero point and 1st 1000 point are used to define the range
of  the  distributions. 

Comparison Model Verification. Several tests were
performed on the Comparison Model to verify that it was
properly implemented. The first is a verification that the
uniform independent random deviates are actually uniform.
The second verification test is observing the shape of the
distributions after the shape factors have been applied.
The third test is verifying that the mean and variance are
similar to the analytical solutions. The fourth test is
verifying that the correlated random deviates are indeed
correlated. The Comparison Model passed all four tests and
this verified the methodology's implementation.

For Comparison Model verification, one test case was
verified to have the user defined correlation. The post
Cholesky decomposition random deviates were tested by using
SAS PROC CORR (22:258-261) for one correlation value.
Other correlations were tested visually by graphing the post
Cholesky random deviates as an XY plot. The user defined
correlations were indeed maintained for the one SAS test
case and for the visual test over a range of correlations.

The Verification Process. The Comparison Model is used
to verify the implementation of the Tecolote Risk Model.
The mean, variance and end points of the total cost
distributions were examined.

Validation  of  the  Tecolote  Risk  Methodology 

Overview.  This  section  covers  three  topics  for  the 
Tecolote  Risk  Methodology  validation  process.  First,  the 
test  data  used  to  validate  the  methodology  is  described. 
Secondly,  the  selection  of  logically  consistent  correlations 
for  the  test  cases  (passing  Failure  Mode  1  from  Figure  3)  is 
made.  Thirdly,  the  three  criteria  described  in  Chapter  I 
(Failure  Mode  2)  are  formally  defined. 

Data. The data that will be tested in this thesis are
25 pairs of triangular distributions. The triangular
distribution is one of three possible types of distributions
allowed in the Tecolote Risk Model (the Beta and Uniform are
the other two types). There will be five test distributions
in all. All five distributions range from 0 to 1000. The
modes of the five distributions are: 0, 250, 500, 750 and
1000. The distributions will be tested against themselves
and each other, resulting in the 25 (5 x 5 = 25) test cases.
Since the total cost distribution is sensitive to the order
of the lower level cost element distributions (the Cholesky
decomposition effect) (4:17), both orderings of each pair
will be tested: for example, distribution 1 (Low 0 - Mode
250 - High 1000) with distribution 2 (0-750-1000), and
distribution 1 (0-750-1000) with distribution 2
(0-250-1000). It is important to remember that the first
cost distribution is fixed, while the second distribution is
altered. The correlation coefficient (ρ) will be allowed to
vary from -0.9 to +0.9 in 0.1 increments for each of the 25
cases. The twenty-five cases are shown in Table 1.
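
As a rough illustration, the test grid can be enumerated in
Python. The case numbering used here (distribution 1 mode
varying slowest) is an assumption; it is consistent with the
text's description of Case 1 (both modes 0) and Case 20
(both distributions left skewed), but it is not stated
explicitly.

```python
from itertools import product

MODES = [0, 250, 500, 750, 1000]  # all five distributions range 0 to 1000

# Ordered pairs (mode of distribution 1, mode of distribution 2).
# Order matters because only distribution 2 is altered by the factoring.
cases = list(product(MODES, repeat=2))

# Correlations from -0.9 to +0.9 in 0.1 increments.
rhos = [k / 10 for k in range(-9, 10)]

# Every (case, correlation) combination that is simulated.
runs = [(case, rho) for case in cases for rho in rhos]
```

This gives 25 cases and 19 correlation values, or 475
simulation runs per random number seed.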
Table 1 Test data

[Table body largely illegible in the source. Column
headings: TEST CASE (CASE 1 through CASE 25), DISTRIBUTION
1, DISTRIBUTION 2, and CONSISTENCY TEST 1b. Legible entries:
MODE 0 | MODE 250 | UNCERTAIN (among Cases 1-5); MODE 750 |
MODE 1000 (among Cases 16-20); MODE 1000 | MODE 250 |
UNCERTAIN (among Cases 21-25).]

Consistent Input Parameters (User's Burden). The
analyst must ensure that the input data are consistent. The
user of any dependent cost element risk analysis methodology
must specify internally consistent correlations. There are
two constraints that the analyst must be concerned with. The
first is that the correlation matrix must be positive
semidefinite. The second is that the cost element
distributions must be logically consistent in relation to
the properly specified correlations.

Although the only mathematical restriction on the
correlation matrix is that it must be positive semidefinite,
users of the Tecolote Risk Model must test the correlation
matrix for positive definiteness due to the use of the
Cholesky decomposition. The test for positive definiteness
is as described earlier in this chapter. It is
straightforward and easily accomplished. This is referred
to as Failure Mode 1a in Figure 3.
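
One common way to check positive definiteness is simply to
attempt the Cholesky decomposition and see whether it
succeeds. The sketch below takes that approach in plain
Python; it is not necessarily the test described earlier in
the chapter, and the function names are illustrative.

```python
import math


def cholesky(matrix):
    """Return lower-triangular L with L * L^T = matrix.

    Raises ValueError if the symmetric matrix is not positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = matrix[i][i] - s
                if d <= 0.0:
                    # A zero or negative pivot means no valid factor exists.
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def is_positive_definite(matrix):
    """True if the Cholesky decomposition of the matrix succeeds."""
    try:
        cholesky(matrix)
    except ValueError:
        return False
    return True
```

For two cost elements the matrix [[1, ρ], [ρ, 1]] passes for
any ρ strictly between -1 and +1 and fails at exactly -1 or
+1. With three or more elements, individually reasonable
correlations can still fail jointly, for example 0.9, 0.9
and -0.9 among three elements.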

Determining whether distributions are logically consistent
in relation to the specified correlations is a more
intuitive exercise. This is referred to as Failure Mode 1b
in Figure 3. According to Murphy, two distributions that are
identically distributed can be independent or positively
correlated. Any negative correlation between two identically
distributed cost elements is illogical. For two correlated,
identically distributed cost variables, the user should
expect that if one cost element increases in cost then the
second cost element also increases in cost. The changes in
cost should be in the same direction. Of course, if the cost
elements are independent, then the direction of changes in
cost between the two cost elements is not predictable. Two
distributions that are opposed should be logically
consistent for all negative correlations. Any positive
correlation between opposed distributions should be
logically inconsistent (18).

However, there is a much larger "gray" area of
distributions, which does not have an obvious determination
of consistency. There is uncertainty in what correlation
range should exist between two non-identically distributed
cost elements. If both cost elements are skewed right, but
not identical, over what range may the correlation vary and
still be consistent? This is a subjective question left to
future research.

The selected test cases that are considered to be
identically distributed will also be tested at +0.99
correlation. Cases that are considered to be opposed will
be tested at -0.99. A -1 or +1 correlation coefficient
cannot be tested because the correlation matrix is then not
positive definite.

The validation of the Tecolote Risk Methodology requires
logically consistent inputs. The selected cases that are
considered to be consistent are cases 1, 5, 7, 9, 13, 17,
19, 21 and 25 as displayed in Table 1. In addition, case 20
will be discussed and compared to the results of the other
cases. Case 20 should have some positive correlation range
(the absolute range is uncertain) since both of its
distributions are left skewed. Column 4 of Table 1 displays
the correlations that are considered to pass the user's
burden criteria from Figure 3. All distributions pass when
the distributions are statistically independent; therefore 0
correlation is not noted in the table.

Validation Methodology (Methodology's Burden). Failure
Mode 2 requires the application of a set of tests or
criteria on the cost risk methodology applied to properly
specified input parameters. Criterion 2a is concerned with
the input and output cost element correlations. Criterion
2b is concerned with the total cost distribution statistics.
Criterion 2c is concerned with the cost element distribution
shape.

Criterion 2a states that the user specified correlation
matrix must be maintained through the cost risk methodology.
To verify that the output correlations are equal to the
input correlations, simply verify the output correlation
mathematically. If the output correlation equals the user
specified correlation, then the methodology is valid. The
mathematical approach is the best validation process.
However, by determining the correlation between the
correlated distribution random deviates, the user may also
verify the methodology's implementation. If the output
correlations are equal to the user specified correlations,
then the model passes this criterion.

Book and Young showed that the correlations specified by
the user are maintained through the Cholesky decomposition
(4:11).
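
The empirical side of criterion 2a can be sketched as
follows: generate two independent standardized vectors,
apply the two-element Cholesky factor, and measure the
sample correlation of the result. Standard normal deviates
are used here only as a convenient stand-in for independent
z scores, and the function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def output_correlation(rho, n=1000, seed=10000):
    """Correlation between z1 and the post-Cholesky z2 for an input rho."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c = math.sqrt(1.0 - rho * rho)
    z2_star = [rho * a + c * b for a, b in zip(z1, z2)]
    return pearson(z1, z2_star)
```

The measured correlation will differ slightly from the input
value because of sampling error, which shrinks as the number
of draws n grows.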

Criterion 2b is the validation of the model's summary
statistics for total cost. This criterion will use
equations 3 and 4 described in Chapter I. This will be the
validation of the methodology against the analytical
solution. The mean and variance of the sum of distributions
are easily computed.
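
A minimal sketch of the analytical calculation, assuming
equations 3 and 4 take the usual form (the total mean is the
sum of the element means, and the total variance is the sum
of the element variances plus twice the covariance). The
triangular mean and variance formulas are the standard ones;
the function names are illustrative.

```python
import math


def triangular_mean(low, mode, high):
    """Mean of a triangular distribution."""
    return (low + mode + high) / 3.0


def triangular_variance(low, mode, high):
    """Variance of a triangular distribution."""
    return (low * low + mode * mode + high * high
            - low * mode - low * high - mode * high) / 18.0


def total_cost_moments(dist1, dist2, rho):
    """Mean and variance of the sum of two correlated cost elements.

    dist1 and dist2 are (low, mode, high) tuples; rho is their correlation.
    """
    mean = triangular_mean(*dist1) + triangular_mean(*dist2)
    v1, v2 = triangular_variance(*dist1), triangular_variance(*dist2)
    covariance = rho * math.sqrt(v1 * v2)
    return mean, v1 + v2 + 2.0 * covariance
```

For two distributions ranging from 0 to 1000 with a mode of
250, this gives a total cost mean of about 833 and total
cost variances of about 45,139, 90,278 and 135,417 at
correlations of -0.5, 0 and 0.5.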

Criterion  2c  will  test  the  change  in  shape  of  the  second 
distribution  as  a  function  of  the  input  correlation.  This 
test  will  be  accomplished  with  the  use  of  the  Chi-square 
goodness  of  fit  test.  Although  all  correlations  are  tested 
from  -0.9  to  0.9,  the  only  correlations  that  will  be 
discussed  in  Chapter  IV  are  those  that  are  logically 
consistent.

The criterion 2c hypothesis test is:

H_0: The postfactored second WBS cost element
     distribution is equivalent to the user input
     second WBS cost element distribution

H_a: Reject H_0 if \chi^2_{test} > \chi^2_{17 df, 0.01}

The Chi-square goodness of fit test uses the pre and
post factored probability density functions (p.d.f.) from
distribution 2. The p.d.f.s are divided into a total of 18
classification intervals. Sixteen of the intervals are of
size 50 and the remaining two are -infinity to 100 and 900
to +infinity. The classification interval definition
results in a 17 degrees of freedom Chi-square hypothesis
test. The Chi-square goodness of fit test is sensitive to
the size of the classification interval (14:196-197).
Therefore, the Chi-square test will also be run with 10
and 34 classification intervals.

Newbold defines the Chi-square test as follows:

\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}    (21)

Where O_i is the observed frequency distribution, E_i is the
expected frequency distribution and K is the number of
intervals (20:414).

In the case of the Comparison Model, the expected
frequency is the pre-factored distribution 2 and the
observed frequency is the post-factored distribution 2.
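
The test statistic for the 18-interval case can be sketched
in Python as shown below. The treatment of values that fall
exactly on an interval edge (counted in the lower interval)
is an assumption, as are the function names.

```python
from bisect import bisect_left


def interval_upper_edges(first=100.0, last=900.0, width=50.0):
    """Upper edges 100, 150, ..., 900: 17 edges define 18 intervals."""
    count = int(round((last - first) / width)) + 1
    return [first + i * width for i in range(count)]


def interval_counts(values, upper_edges):
    """Frequencies for (-inf, 100], (100, 150], ..., (900, +inf)."""
    counts = [0] * (len(upper_edges) + 1)
    for v in values:
        counts[bisect_left(upper_edges, v)] += 1
    return counts


def interval_statistics(observed, expected):
    """The K individual terms (O_i - E_i)^2 / E_i of equation 21."""
    if any(e <= 0 for e in expected):
        raise ValueError("every expected frequency must be positive")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


def chi_square(observed, expected):
    """Chi-square goodness of fit statistic (sum of the interval terms)."""
    return sum(interval_statistics(observed, expected))
```

Here the expected frequencies would be the interval counts
of the prefactored distribution 2 and the observed
frequencies those of the postfactored distribution 2.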

In addition to the Chi-square test, this research will
exhibit the individual (O_i - E_i)^2/E_i terms from the
Chi-square test and the boundary charts (footprints) of the
distributions as analysis tools. The footprint or boundary
graphs exhibit the maximum, minimum and mode of the pre and
post factored second cost element distribution.
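
A footprint can be computed from a vector of draws roughly
as follows. How the thesis estimates the mode from simulated
data is not stated; taking the midpoint of the most heavily
populated interval of width 50 is an assumption made for
this sketch.

```python
import math


def footprint(values, bin_width=50.0):
    """Return (minimum, maximum, mode estimate) for a list of draws.

    The mode estimate is the midpoint of the fixed-width interval
    that contains the most observations.
    """
    counts = {}
    for v in values:
        index = math.floor(v / bin_width)  # also handles negative costs
        counts[index] = counts.get(index, 0) + 1
    busiest = max(counts, key=counts.get)
    mode = (busiest + 0.5) * bin_width
    return min(values), max(values), mode
```

Computing the footprint of the prefactored and postfactored
distribution 2 at each correlation gives the six series
plotted on a boundary chart.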

The  Chi-square  goodness  of  fit  test  could  also  be 
sensitive  to  the  random  number  seed.  Therefore,  the  random 
number  seed  was  varied  for  the  twenty-five  test  cases. 

If  the  methodology  passes  criteria  2a,  2b  and  2c,  then 
the  methodology  is  valid.  It  is  possible  that  the 
methodology  is  valid  only  under  certain  conditions.  These 
conditions will be described in Chapter IV.

Validity

The  internal  validity  of  this  validation  methodology  is 
shown  by  the  tests  that  were  done.  Extreme  cases  of 
distributions  were  tested.  These  are  pairwise  comparisons 
of  the  before  and  after  factorization  process.  The  testing 
methodology  described  in  this  chapter  could  be  extended  to 
multiple WBS cost elements. This paper limits the number of
WBS elements to two for ease of analysis; it is difficult to
determine causality in a more complex WBS structure.

The Chi-square goodness of fit test is a commonly used
statistical test for comparison of distribution shapes
(20:412-413). This test should give the user some
quantitative reason for limiting correlations given a set of
cost element distribution shapes. Three interval sizes were
evaluated to reduce Type I errors. A Type I error is
rejecting the null hypothesis when the null hypothesis
should not be rejected (20:332).

Analysis 

The Chi-square goodness of fit statistics were plotted
versus the correlation coefficient. If the null hypothesis
is rejected for a distribution, then an investigation of why
it failed must be made. The determination of where it failed
was done with two other sets of data. The first is the
footprint or boundary graph of the minimum, maximum and mode
of the pre and post factored second distribution. The second
is the analysis of the (O_i - E_i)^2/E_i terms (referred to
as the interval statistic) for all the classification
intervals in the Chi-square test.

Conclusions

The  above  methodology  may  be  applied  to  any  dependent 
cost  risk  analysis  model.  Specifically,  this  analysis  will 
provide  the  Tecolote  Risk  model  user  with  the  ability  to 
know  the  limitations  of  the  cost  risk  model  and  verify 
implementation.  The  verification  should  be  done  in  a 
sequence.  First,  verify  the  input  parameters  are  internally 
valid  and  then  determine  if  the  input  parameters  are  valid 
within  the  Tecolote  Risk  Model  restrictions.  If  both 
conditions  are  met,  then  the  analyst  has  some  level  of 
confidence that the input parameters are consistent.

IV. Analysis


Overview 

This  chapter  applies  the  research  methodology  developed 
in  Chapter  III.  All  data  (in  graphical  format)  that  was 
generated  from  the  Comparison  Model  is  included  in  the 
appendices.  Cases  1  and  20  are  reproduced  here  as  well  as 
in  the  appendices  for  clarity  of  discussion.  The  Tecolote 
Risk  Methodology  was  tested  against  the  Failure  Mode  2 
criteria  as  developed  in  Chapter  III  using  valid  input  test 
parameters.

To reiterate, the validation of the Tecolote Risk
Methodology is obtained either analytically or by simulating
the result with the Comparison Model. The verification
process refers to testing the Tecolote Risk Model (Air Force
Risk Model, "riskmain.exe" dated 18 February 1991).

Methodology  Validation 

The  Tecolote  Risk  Model  source  code  was  not  available 
for  this  research;  therefore  other  means  had  to  be 
implemented  in  the  validation  process.  The  validation 
criteria  are  applied  either  mathematically  or  using  the 
output  of  the  Comparison  Model.  Validation  criterion  2a  is 
applied  mathematically.  Validation  criteria  2b  and  2c  are 
applied  through  the  Comparison  Model. 

Criterion 2a - Correlation Coefficient. Criterion 2a
states that the user defined WBS cost element correlations
should be maintained through the cost risk model (i.e.,
input ρ = output ρ). In other words, the correlation
between elements 1 and 2 should be ρ_12. Book and Young
showed that the correlation between WBS cost elements is
maintained through the Cholesky decomposition. From
equation 19, the correlation between z_1^* and z_2^* may be
verified:

CORR(z_1^*, z_2^*) = CORR(z_1, \rho_{12} z_1 + \sqrt{1 - \rho_{12}^2} z_2)

  = CORR(z_1, \rho_{12} z_1) + CORR(z_1, \sqrt{1 - \rho_{12}^2} z_2)    (22)

  = \rho_{12} CORR(z_1, z_1) + \sqrt{1 - \rho_{12}^2} CORR(z_1, z_2) = \rho_{12}

CORR(z_1, z_1) equals 1 and, since the z scores are generated
independently, CORR(z_1, z_2) equals 0. Equation 22 shows
that the user defined correlations are maintained through
the Cholesky decomposition. Book and Young include in their
documentation that other pairs of cost elements maintain
their correlation (4:11). The Tecolote Risk Methodology
passes criterion 2a.

Criterion  2b  -  Mean  and  Variance.  Criterion  2b  states 
that  the  total  cost  mean  and  variance  calculated  by  the  cost 
risk  model  should  be  equal  to  the  analytical  total  cost  mean 
and  variance  resulting  from  equations  3  and  4. 

Consider two triangular cost distributions. Both
distributions have a range from 0 to 1000 with a mode of
250. The correlation is limited to three values: -0.5, 0,
and 0.5. The Trial column in Table 2 reflects the choice of
the random number seed for the pseudorandom number
generator. The seed values are: 10,000 (random seed 1),
1,589,823,392 (random seed 2), 776,519,062 (random seed 3),
1,817,216,169 (random seed 4), and 641,504,206 (random seed
5). The five random number seeds generate independent
random number strings for the simulation. Results of the
simulation runs are shown in Table 2.


Table 2 Results of Tecolote Risk Methodology validation
criterion 2b

             ANALYTICAL RESULT                COMPARISON MODEL RESULT
             TOTAL COST   TOTAL COST          TOTAL COST   TOTAL COST
CORRELATION  MEAN         VARIANCE     TRIAL  MEAN         VARIANCE

-0.5         833          45,139       1      847          46,320
                                       2      831          47,510
                                       3      840          44,707
                                       4      836          46,615
                                       5      831          44,789

 0           833          90,278       1      847          92,474
                                       2      831          95,310
                                       3      840          89,220
                                       4      836          93,797
                                       5      831          89,753

 0.5         833          135,417      1      847          138,963
                                       2      831          142,539
                                       3      840          133,804
                                       4      836          139,948
                                       5      831          134,380

Clearly, Table 2 exhibits that the total cost mean and
variance properly reflect the analytical values calculated
using equations 3 and 4. The Mann-Whitney non-parametric
test for equivalent means was used to test the analytical
mean against the simulation mean (5:224-229). The
analytical values from equation 3 (see column 2) were
compared to the simulation values from the Comparison Model
(see column 5). The means from the two methods (analytical
and simulation) are equivalent as tested by Mann-Whitney at
the 90% confidence level. Equation 4 states that the total
cost variance is equal to the sum of the cost element
variances plus two times the covariance between the cost
elements. Since covariance is a function of the correlation
coefficient, the total cost variance should vary with
correlation. By inspection, the Tecolote Risk Methodology
total cost variance (see column 6) reflects the total cost
variance calculated analytically (see column 3) from
equation 4. The Tecolote Risk Methodology passes this
criterion. Other distributions and correlations were tested
but are not included in this documentation. All other
trials have the same result.

Criterion 2c - Distribution Shapes. Criterion 2c states
that the input WBS cost element probability density function
shapes should be the same as the output shapes. This
chapter will describe the analysis for the 9 cases that have
been assumed to be logically consistent. In addition, case
20 will be described as an alternative case with an
uncertain range of logically valid correlations. Although
10 cases in all are discussed, the greatest detail will be
on two cases (1 and 20).

An overview of the general findings is that the Tecolote
Risk correlation methodology distorts the second of the two
WBS cost distributions defined. In fact, except for
correlation values near 0 (independence), the post factored
distribution is not equal to the user defined distribution.
The correlation methodology is the multiplication of the
independent distribution random deviates by the Cholesky
factor of the correlation matrix (12:11). The effect on the
distribution is as shown in Figure 7 and is indicated by
three sets of statistics. The first is a Chi-square goodness
of fit test between the pre and post factored second element
distributions. The second is the change in the post factored
distribution skewness. The third is the change in the range
(upper and lower limits) and mode for the post factored
distribution (this is referred to as the footprint or
boundary of the distribution). Note that since the changes
in skewness and range are captured by the Chi-square
goodness of fit test, the latter two statistics will not be
explicitly tested; they are simply a visual indication of
the change in shape.

Note that since the ranges of the individual WBS element
distributions are altered, so is the range of the total cost
distribution. That is, if the user defines two cost
distributions with a range of 0 to 1000 with any mode and
any non-zero correlation, the total cost distribution will
have a range from less than zero to greater than 2000. The
user would expect the total cost distribution range to be
from 0 to 2000.

Figure 7 Case 1 - Distribution 2 pre and post factored cost
probability density functions (Case 1, random seed 1,
correlation = 0.6; horizontal axis in dollars)

The  Chi-square  goodness  of  fit  graphs  are  the  primary 
output  from  criterion  2c.  To  aid  analysis  of  the  Chi-square 
goodness  of  fit  graphs,  the  boundary  graph  and  a  table  of 
the  interval  statistics  will  be  used.  A  description  of  how 
to  use  the  boundary  graph  and  interval  statistics  will  be 
provided in the discussion of Case 1.

Case 1. Distributions 1 and 2 are identically
distributed cost variables. Distributions 1 and 2 are
defined over the range 0 to 1000, each with a mode of 0.
Figure 8 displays the Chi-square goodness of fit test
statistic and critical value over the range -0.9 ≤ ρ ≤ 0.9.

The goodness of fit test statistic is based on the
difference between the pre and post factored second
distributions. The goodness of fit test statistic
quantifies what is visually seen in Figure 7. Figure 7
exhibits the pre and post factored distribution 2 from case
1 random seed 1. When ρ = 0.6, the largest interval
statistic is in the interval 400 to 500. This can be seen
in Figure 7 as well as in Table 3 (for ρ = 0.6 and the
interval 400 to 500, the interval statistic is 10). The mode
of the distribution has changed considerably from the user
defined value. Instead of being a right skewed right
triangle, the post factored distribution is closer to being
symmetrical. Figure 7 further indicates that the
postfactored random deviates are being drawn outside of the
user defined range. Three and one-half percent (35
observations / 1000 total observations * 100) of the
postfactored distribution observations fall below the
prefactored distribution minimum value. This means that the
cost analyst obtains a negative cost 35 out of 1000 times
for a logically consistent set of input parameters.

Figure 8 displays the 90% (α = 0.10) and 99% (α = 0.01)
confidence level critical values for 17 degrees of freedom
(d.f.). From Newbold, the critical values are 24.77 and
33.41 respectively. Any test statistic that exceeds the
critical value fails the Chi-square goodness of fit test
(20:412-416, 832-833).

Figure 8 Chi-square test for Case 1 with random seed 1 and 17
d.f.

When interpreting the Chi-square graphs, any correlation
with a test statistic greater than the critical value fails
the Chi-square goodness of fit test. Remember that the Case
1 logical input correlations are limited to 0 ≤ ρ ≤ 0.9. For
example, for the logical input parameters, the postfactored
distribution fails the goodness of fit test at ρ = 0.5 at
the 90% confidence level. This means that for correlations
ranging from 0 to 0.4, the postfactored distribution is
equivalent to the prefactored distribution. At the 99%
confidence level, the postfactored distribution never fails
the goodness of fit test. The user should expect the
distribution to pass for all positive correlations at either
the 90% or 99% confidence level. In observing other Case 1
random seed trials, the maximum correlation value that
passes at 99% confidence is ρ = 0.4. Therefore, in general
the analyst should limit the correlation between Case 1
distributions to 0.4. The user should expect the second
distribution to fail the Chi-square test for negative
correlations. The distribution does indeed fail the
Chi-square test at the 90% confidence level for ρ < -0.3.
The Chi-square test statistic is greater for negative
correlations than it is for positive correlations equally
distant from the origin. That is, if the user compares the
Chi-square statistic at -0.5 to the test statistic at 0.5,
the Chi-square test statistic is larger for the logically
inconsistent correlation definition.

According to Newbold, as the confidence level decreases,
the confidence interval around the expected outcome
decreases. As the confidence level varies, there is a
tradeoff between Type I and Type II errors, assuming
everything else remains the same. Confidence level is equal
to 1 - significance level (α). The significance level α is
the probability of a Type I error, that is, of rejecting a
true null hypothesis. β is the probability of a Type II
error, that is, of accepting a false null hypothesis. As the
confidence level is increased (90% to 99%), the probability
of a Type I error decreases. However, at the same time the
probability of accepting a false null hypothesis (Type II
error) increases. Power is equal to 1 - β. The power of a
hypothesis test is the probability of correctly rejecting a
false null hypothesis (20:329-335, 377-382). The Chi-square
statistic remains constant for all confidence levels.
However, the decision to accept or reject the null
hypothesis is dependent on the user's acceptance of the
confidence level - power tradeoff. The postfactored
distribution may pass the Chi-square test at 99% confidence
and fail at 90% confidence. Neither the shape of the
distribution nor the Chi-square statistic changes; only the
acceptance or rejection of the null hypothesis changes.
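
The point that one statistic can lead to different decisions
can be shown with a very small sketch. The critical values
are the 17 degrees of freedom values quoted from Newbold in
this chapter (24.77 and 33.41); the function name and the
example statistic of 30.0 are purely illustrative.

```python
# Chi-square critical values for 17 degrees of freedom, as quoted in the text.
CRITICAL_VALUES_17_DF = {"90%": 24.77, "99%": 33.41}


def reject_null(chi_square_statistic):
    """Map each confidence level to True if the null hypothesis is rejected."""
    return {
        level: chi_square_statistic > critical
        for level, critical in CRITICAL_VALUES_17_DF.items()
    }
```

A statistic of 30.0, for example, exceeds 24.77 but not
33.41, so the same postfactored distribution fails at 90%
confidence and passes at 99% confidence.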

Recall from Chapter III that the null hypothesis is that
the postfactored second WBS cost element distribution is
equivalent to the input second WBS cost element
distribution. Therefore, if the confidence level is
increased from 90% to 99%, the user decreases the
probability of rejecting a null hypothesis when he/she
should not. The Chi-square hypothesis test confidence level
(critical value) is chosen to be 99%. This reduces the
probability of a Type I error.

Two other measures ease the analysis of why the
postfactored distribution differs from the input
distribution. The two measures are boundary graphs and
interval statistic tables. The boundary graph illustrates
when the Chi-square test is failing because the distribution
is expanding beyond the original limits. The interval
statistic table illustrates which interval has the largest
difference between the prefactored distribution and the
postfactored distribution.

Figure 9, Case 1 Boundary Chart, displays the maximum,
minimum and mode for the pre and post factored second
distribution. The solid box (■) indicates the prefactored
distribution minimum value location, the asterisk (*)
indicates the prefactored maximum value location, the cross
symbol (X) indicates the prefactored modal value location,
the plus sign (+) indicates the postfactored minimum value
location, the open box (□) indicates the postfactored
maximum value location, and the filled triangle (▲)
indicates the postfactored modal value location.

Figure 9 is related to Figure 7 by showing that the
minimum, maximum and mode change as a function of
correlation. Figure 7 indicates that the first observation
from prefactored distribution 2 is at 25. The reader may
confirm this with Figure 9 at correlation = 0.6, where the
solid box is at approximately 25. The same may be said for
the maximum and modal values for the prefactored and
postfactored distributions. Figure 7 indicates that the
postfactored distribution minimum value is approximately
-150, the same as Figure 9 for correlation = 0.6. The
maximum value of the postfactored distribution is not as
easily seen in Figure 7 as it is in Figure 9. However, it
is clear that there is at least one observation above 1000.
Figure 9 shows that for correlation = 0.6, the maximum value
is approximately 1100.

Figure 9 Boundary chart for Case 1 with random seed 1

Note that all correlations (-0.9 ≤ ρ ≤ 0.9) are tested
in Figures 8 and 9. The user should expect a larger
Chi-square test statistic for logically inconsistent
correlations. This is exhibited by the relatively larger
test statistics for negative correlations than those
calculated for positive correlations. The lower bound shown
in Figure 9 indicates why the Chi-square test statistic is
so large. The lower bound should be at 25, and for negative
correlations the lower bound ranges from approximately -350
to -50. The extension of the lower and upper boundaries
affects the total cost distribution. If the user should
specify  Case  1  distributions  with  a  -0.9  correlation,  the 
total  cost  would  range  from  -350  to  whatever  the  maximum 
value  is  from  distribution  1.  Note  that  at  correlation  -0.9 
the  upper  distribution  bound  is  also  decreased  from  the 
original  limit.  The  total  cost  upper  limit  would  be  the 
maximum  cost  possible  from  distribution  1  plus  approximately 
900  from  distribution  2.  The  postfactored  mode  moves  toward 
the  center  of  the  distribution  as  correlation  decreases 
(becomes  more  negative).  The  minimum  value  of  the 
distribution  varies  as  a  function  of  correlation  and  even 
for  logically  consistent  correlations,  the  lower  bound 
decreases  into  the  negative  cost  range.  If  the  user  defined 
a  distribution  with  a  lower  limit  of  0,  the  Tecolote  Risk 
Model  would  actually  draw  negative  costs  from  the 
distribution.
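
The boundary extension described above can be illustrated
with a short calculation. The sketch below is not the
Tecolote implementation. It assumes that the 2 x 2 Cholesky
factor is applied to standardized deviates, so that
z2' = ρ z1 + sqrt(1 - ρ²) z2, and that Case 1 consists of two
triangular distributions with low 0, mode 0 and high 1000.
The function names are hypothetical.

```python
import math


def tri_mean_sd(low, mode, high):
    """Mean and standard deviation of a triangular distribution."""
    mean = (low + mode + high) / 3.0
    var = (low ** 2 + mode ** 2 + high ** 2
           - low * mode - low * high - mode * high) / 18.0
    return mean, math.sqrt(var)


def postfactored_bounds(d1, d2, rho):
    """Theoretical (lower, upper) bounds of distribution 2 after factoring.

    d1 and d2 are (low, mode, high) tuples. Assumption: the factor acts on
    standardized deviates, z2' = rho * z1 + sqrt(1 - rho^2) * z2.
    """
    m1, s1 = tri_mean_sd(*d1)
    m2, s2 = tri_mean_sd(*d2)
    z1_lo, z1_hi = (d1[0] - m1) / s1, (d1[2] - m1) / s1
    z2_lo, z2_hi = (d2[0] - m2) / s2, (d2[2] - m2) / s2
    w = math.sqrt(1.0 - rho * rho)
    # rho * z1 is smallest at z1_lo when rho >= 0 and at z1_hi when rho < 0.
    lo = rho * (z1_lo if rho >= 0 else z1_hi) + w * z2_lo
    hi = rho * (z1_hi if rho >= 0 else z1_lo) + w * z2_hi
    return m2 + s2 * lo, m2 + s2 * hi


case1 = (0.0, 0.0, 1000.0)  # assumed Case 1 parameters
for rho in (-0.9, 0.0, 0.6):
    lower, upper = postfactored_bounds(case1, case1, rho)
    print(f"rho = {rho:+.1f}: lower = {lower:8.1f}, upper = {upper:8.1f}")
```

Under these assumptions the bounds stay at 0 and 1000 for
ρ = 0, extend to roughly -133 and 1267 at ρ = 0.6, and at
ρ = -0.9 the lower bound is negative while the upper bound
falls below 1000, which is the qualitative behavior read
from the boundary chart.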

However, negative correlations are not logically
consistent for Case 1 distributions, so the Tecolote Risk
Model cannot be criticized for distorting the second
distribution at those correlations. The problem is that for
logically consistent correlations, the upper and lower
limits of the distribution are also extended. The
distribution fails the Chi-square test for 90% confidence
at ρ = 0.5. The boundary chart for Case 1 shows that the
distribution lower limit is at -100.

The mode of the postfactored distribution also shifts as
correlation varies. Figure 9 shows that the mode of the
postfactored distribution at ρ = 0.6 has shifted from 50 to
425. The interval statistics confirm that this is indeed
the case for Case 1 with random seed 1. The size of this
shift is specific to this case and random seed. There is,
however, a general trend for the mode to shift away from
the prefactored distribution mode as the correlation is
increased from zero. This general trend is true for all
cases and random number seeds.

The interval statistics are in tabular format as shown
in Table 3 for Case 1 random seed 1. The interval
statistic, $(O_i - E_i)^2/E_i$ from each interval, is used
to evaluate where the distribution has been distorted the
most for the evaluation cases. The $(O_i - E_i)^2/E_i$
values are the interval values that are summed to give the
Chi-square test statistic (20:414). This is an indication
of where the distribution has changed the most in shape.
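
The calculation of the interval statistics can be sketched
as follows. This is an illustration only: the function
names and the cut points are hypothetical, and treating the
prefactored counts as the expected values and the
postfactored counts as the observed values is an assumption
about the procedure. The critical values are the standard
tabulated Chi-square values for 17 degrees of freedom.

```python
import bisect

# Standard tabulated Chi-square critical values for 17 degrees of freedom.
CHI2_CRITICAL_17_DF = {0.90: 24.769, 0.99: 33.409}


def bin_counts(samples, cut_points):
    """Count samples in (-inf, c0], (c0, c1], ..., (c_last, +inf).

    cut_points must be sorted; k cut points give k + 1 intervals, so the
    first and last intervals are open ended.
    """
    counts = [0] * (len(cut_points) + 1)
    for x in samples:
        counts[bisect.bisect_left(cut_points, x)] += 1
    return counts


def interval_statistics(observed, expected):
    """Per-interval values (O_i - E_i)^2 / E_i."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


def chi_square_statistic(observed, expected):
    """Sum of the interval statistics."""
    return sum(interval_statistics(observed, expected))


# Small made-up example with three intervals (not thesis data).
expected = [15, 15, 30]
observed = [10, 20, 30]
stats = interval_statistics(observed, expected)
worst = stats.index(max(stats))
print("interval statistics:", stats)
print("largest contribution in interval", worst)
print("Chi-square statistic:", chi_square_statistic(observed, expected))
```

With 18 classification intervals (17 cut points) the
statistic would be compared against
CHI2_CRITICAL_17_DF[0.90] or CHI2_CRITICAL_17_DF[0.99].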

As exhibited in Figures 7, 8 and 9, the postfactored
distribution is distorted in test case 1. The valid
correlations for this pair of distributions range from
0 ≤ ρ < 1. For random seed 1, 17 degrees of freedom (d.f.),
and 90% confidence level the distribution fails at ρ = 0.5
and never passes over the remaining range to ρ = 0.9. For
the same test with 99% confidence, the distribution never
fails over the same range. When investigating the
$(O_i - E_i)^2/E_i$ values for each interval, the largest
interval statistic is at the two lowest intervals of the
distribution (i.e., -infinity to 100) for all correlations
except one. The exception is at correlation = 0.6.

Table 3 Case 1 - Random seed 1 interval statistics

There does not seem to be a consistent pattern in where
the interval statistics are the largest for a given random
number seed and correlation. That is, as the correlation is
varied, the Chi-square test statistic varies, but the
interval in which the prefactored and postfactored
distributions differ the most also varies.

When observing case 1 with the four other random number
seeds, the distribution fails the goodness of fit test over
a different correlation range. The ranges over which the
distribution passes the Chi-square test at 90% confidence
for logical correlations and random seeds 2, 3, 4, and 5
are 0 < ρ < 0.2, 0 < ρ < 0.2, 0 < ρ < 0.3, and 0 < ρ < 0.4,
respectively. The results at 90% confidence and 99%
confidence are summarized in Table 4.

Again  the  largest  interval  statistic  is  around  the 
middle  (around  500)  of  the  distribution  for  mid-value 
correlations  (0.4  to  0.7).  This  is  true  for  4  of  the  5 
random  seed  trials  for  case  1.  The  fifth  trial  has  the 
largest  interval  statistic  in  the  maximum  interval.  The 
Tecolote Risk Methodology will, with this pair of
distributions, weight the center of the defined
distribution more heavily than it should. That is, more
draws will come from around 500 than the user originally
defined. This means that the Cholesky decomposition is
affecting the mode.

Table 4 Range of acceptance for Case 1

The Chi-square goodness of fit test is sensitive to the
random  seed  and  to  the  Chi-square  test  classification 
interval  size.  The  random  number  seed  was  varied  for  all 
test  cases  with  5  different  random  seeds  and  the 
classification  interval  size  was  varied  over  3  values. 

Table  4  indicates  that  the  acceptance  range  varies  as  a 
function  of  the  input  random  seed.  This  research 
investigated  5  random  seeds  for  case  1  and  case  20  to 
maintain  manageability  in  the  data  set.  The  average 
acceptance  range  for  all  5  different  random  number  seeds  is 
shown  in  Chapter  IV  for  all  25  cases.  As  with  all 
simulations,  it  is  dangerous  to  make  conclusions  from  a 
small  number  of  replications  (14:287).  However,  the  data  at 
hand does appear to be consistent. At the 90% confidence
level, the maximum acceptable correlation lies in the range
0.2 ≤ ρ ≤ 0.4. At the 99% confidence level, with the
exception of random seed 1, the maximum acceptable
correlation lies in the range 0.3 ≤ ρ ≤ 0.4. Random seed 1
would appear to be an outlier in this data set.

The second sensitivity area is the size of the test
interval. Ten, 18, and 34 classification intervals were
investigated. Law and Kelton state that interval sizing in
the Chi-square goodness of fit test is a difficult problem
(14:196). The interval sizes are equal with the exception
of the first and last. Law and Kelton state that equal
interval sizes are not required. The power of the
Chi-square goodness of fit statistic is dependent on the
number of classification intervals (14:196-197). The power
of a hypothesis test refers to correctly rejecting the null
hypothesis when the null hypothesis is false (20:332). The
graphs for 10, 18 and 34 classification intervals were
viewed and 18 was selected because it provided the best
compromise between power and confidence. In general, with
34 classification intervals the range of acceptable
correlations decreased from the values obtained with 18
classification intervals. Results from 10 classification
intervals were similar to those from 34 classification
intervals: with 10 classification intervals, the
correlation acceptance range was narrower than with 18
classification intervals. Thus, using 18 classification
intervals is conservative in that it provides the Tecolote
Risk Methodology with the greatest advantage.

The interval statistics are sensitive to the random
number seed. For random seed 1, the interval statistics are
evenly distributed except for three correlation values. For
correlations 0.6 through 0.8, the largest distribution
value starts in the middle and migrates to the minimum
value. This is exhibited by Figure 9, the boundary chart
for Case 1, in that the mode shifts at ρ = 0.6 and then
returns to a smoother migration. The shift of the mode at
ρ = 0.6 is due to the random number seed. The other random
number seeds did not display this exact behavior in the
mode. No strict conclusion may be made about the fact that
the mode migrates as a function of correlation. Other
random number seeds migrate in different directions. The
interval statistic only indicates where the distribution
changes the most. There is no consistent interval where the
distribution fails between random number seeds. Therefore,
as a general evaluation tool, the interval statistic is of
limited value. However, this does not imply that the
Chi-square test statistic is invalid. The exact cost range
in which the distribution fails changes, but the
distribution fails the Chi-square test regardless of which
random seed is chosen for mid to large correlations. The
distribution fails in different quartiles depending on the
random seed chosen.

As stated in the previous paragraph, the largest interval
statistic is dependent on the random number seed. This is
further explained by the following: Random seed 2 has its
largest interval statistic in the 4th quartile of the
distribution. Random seed 3 has the largest interval
statistic in the 2nd and 3rd quartiles of the
distribution. Random seed 4 has its largest Chi-square
interval statistic in the 2nd and 3rd quartiles of the
distribution. Random seed 5 has the largest interval
statistic in the 2nd quartile. Locating a single area
where the distribution fails to pass the Chi-square goodness
of fit test is futile. The interval is too sensitive to the
random number seed.
However, the overall Chi-square test statistic is
relatively stable both in range and absolute value for
different random number seeds.

The user of the Tecolote Risk Model (Air Force Risk
Model) should limit the correlation for Case 1 to
-0.32 ≤ ρ ≤ 0.44.


Case 20. Case 20 is defined with distributions 1 and
2 over the range 0 to 1000. Distribution 1's mode is 750
and distribution 2's mode is 1000. The logically consistent
correlation range is uncertain for this pair of
distributions.

Figure  10  shows  how  distribution  2  is  distorted  at 
correlation  0.5.  With  reference  to  Figure  10,  distribution 
2 is initially a left-skewed right triangle and at
correlation = 0.5, the postfactored distribution's mode has
shifted left. As shown in Figure 10, three and three-tenths
percent (3.3%) of the postfactored distribution lies above
the upper bound of the prefactored distribution at
correlation = 0.5. The interval statistic was investigated
the  same  way  as  for  Case  1.  The  interval  statistic  does  not 
behave  consistently  across  different  random  number  seeds. 
Therefore,  as  an  analysis  tool,  it  is  of  little  use  except 
for  pointing  out  exactly  where  the  largest  change  in  the 
distribution  occurred. 

It is clear from Figure 11 that the Chi-square statistic
itself initially increases as correlation increases from 0
to 0.5 and then remains relatively constant. This is also
true for Case 20 for the other four random number seeds.

Figure 10 Case 20 random seed 1 - pre and post factored cost
probability density functions (panel title: Case 20, random
seed 1, correlation = 0.5)

Figure  12  shows  how  the  distribution  boundaries  increase 
as  the  correlation  moves  away  from  independence.  This 
supports  the  large  Chi-square  test  statistic  for  negative 
correlations  and  larger  positive  correlations. 

Table 5 indicates that there is greater consistency in
Case 20 between random number seeds than in Case 1. The
random number seeds are the same for both cases. No
explanation is offered for why the random number seeds
affect the results differently between cases. However, the
number of replications made for this research could be
expanded to increase the data set and fidelity in the
conclusions.

Figure 11 Chi-square test for Case 20 with random seed 1 and
17 d.f. (panel title: DIST 1 MODE - 750, DIST 2 MODE - 1000,
RANDOM SEED 1)

Figure 12 Boundary chart for Case 20 with random seed 1

Table 5 Range of acceptance for Case 20

The user of the Tecolote Risk Model (Air Force Risk
Model) should limit the correlation between Case 20 type
distributions to -0.34 ≤ ρ ≤ 0.30.

Other Cases. Cases 5, 7, 9, 13, 17, 19, 21 and 25
are discussed in summary terms along with all 25 cases.
The range of correlation logical consistency (column 4) for
each case is shown in Table 6. Columns 2 and 3 of Table 6
indicate the range of correlation acceptance for all 25
cases (results averaged over 5 random seeds). The Air Force
Risk Model user should limit the correlation between pairs
of distributions to the values shown in column 3 (99%
confidence).

Table 6 The average acceptance range for all 25 cases

The average of 5 random number seed acceptance ranges for all 25
cases at 90% and 99% confidence levels

                                                     Logical
                                                     consistency
CASE      90% Confidence       99% Confidence        range
CASE 1    -0.20 ≤ ρ ≤ 0.30     -0.32 ≤ ρ ≤ 0.44      0 ≤ ρ < 1
CASE 2    -0.20 ≤ ρ ≤ 0.20     -0.30 ≤ ρ ≤ 0.36      UNCERTAIN
CASE 3    -0.30 ≤ ρ ≤ 0.44     -0.54 ≤ ρ ≤ 0.50      UNCERTAIN
CASE 4    -0.32 ≤ ρ ≤ 0.20     -0.40 ≤ ρ ≤ 0.26      UNCERTAIN
CASE 5    -0.34 ≤ ρ ≤ 0.22     -0.42 ≤ ρ ≤ 0.28      [illegible]
CASE 6    -0.22 ≤ ρ ≤ 0.30     -0.32 ≤ ρ ≤ 0.46      UNCERTAIN
CASE 7    -0.20 ≤ ρ ≤ 0.24     -0.28 ≤ ρ ≤ 0.34      0 ≤ ρ < 1
CASE 8    -0.36 ≤ ρ ≤ 0.40     -0.48 ≤ ρ ≤ 0.56      UNCERTAIN
CASE 9    -0.28 ≤ ρ ≤ 0.18     -0.36 ≤ ρ ≤ 0.24      [illegible]
CASE 10   -0.26 ≤ ρ ≤ 0.24     -0.44 ≤ ρ ≤ 0.34      UNCERTAIN
CASE 11   -0.22 ≤ ρ ≤ 0.30     -0.38 ≤ ρ ≤ 0.42      UNCERTAIN
CASE 12   -0.24 ≤ ρ ≤ 0.24     -0.28 ≤ ρ ≤ 0.38      UNCERTAIN
CASE 13   -0.48 ≤ ρ ≤ 0.56     [illegible]           -1 < ρ < 1
CASE 14   -0.26 ≤ ρ ≤ 0.22     -0.32 ≤ ρ ≤ 0.28      UNCERTAIN
CASE 15   -0.26 ≤ ρ ≤ 0.26     [illegible]           UNCERTAIN
CASE 16   -0.28 ≤ ρ ≤ 0.30     -0.38 ≤ ρ ≤ 0.40      UNCERTAIN
CASE 17   -0.20 ≤ ρ ≤ 0.20     -0.30 ≤ ρ ≤ 0.32      [illegible]
CASE 18   -0.42 ≤ ρ ≤ 0.46     -0.52 ≤ ρ ≤ 0.70      UNCERTAIN
CASE 19   -0.14 ≤ ρ ≤ 0.26     -0.30 ≤ ρ ≤ 0.32      [illegible]
CASE 20   -0.22 ≤ ρ ≤ 0.24     -0.34 ≤ ρ ≤ 0.30      UNCERTAIN
CASE 21   -0.22 ≤ ρ ≤ 0.22     -0.36 ≤ ρ ≤ 0.38      [illegible]
CASE 22   -0.22 ≤ ρ ≤ 0.26     -0.32 ≤ ρ ≤ 0.28      UNCERTAIN
CASE 23   -0.40 ≤ ρ ≤ 0.54     -0.50 ≤ ρ ≤ 0.68      UNCERTAIN
CASE 24   -0.20 ≤ ρ ≤ 0.26     -0.40 ≤ ρ ≤ 0.40      UNCERTAIN
CASE 25   -0.18 ≤ ρ ≤ 0.24     -0.30 ≤ ρ ≤ 0.38      0 ≤ ρ < 1

Cases 3, 8, 13, 18, and 23, which defined the symmetrical
distribution for the second distribution, had the largest
correlation range of acceptance. Specifically, Case 13 had
the widest range of acceptance of all cases tested. Case 13
is as expected since the logical correlation range for it is
-1 < ρ < 1. Cases 11, 12, 14 and 15, which had the
symmetrical distribution for the first distribution, did not
have the same advantage in acceptance range.

Case 13 is interesting because both distributions are
symmetrical. The Cholesky decomposition is suggested by
Johnson as a random deviate correlation method for normal
variates (13:52-55). Although the symmetrical triangular
distribution is not normal, this result does suggest that
symmetrical distributions may be able to utilize the
Cholesky decomposition to correlate random deviates over a
wide range of correlations.

Tecolote  Risk  Model  Verification 

Of  the  three  criteria  described  in  Chapter  III,  only  2b 
can  be  used  to  verify  the  implementation  of  the  Tecolote 
Risk  Model.  Criterion  2a  has  been  accomplished 
mathematically. However, it remains unknown whether the
Cholesky factor has been applied correctly in the Air Force
Risk Model. An investigation of the Air Force Risk Model
computer program would be necessary. Criterion 2c would
require  access  to  the  model's  random  deviates.  The  Tecolote 
Risk  Model  does  not  allow  access  to  the  random  deviates. 
Thus,  this  research  can  only  investigate  the  difference 
between  the  simulation  and  analytical  total  cost  summary 
statistics  (criterion  2b). 

Verification  Criterion  2b.  The  verification  of  the 
Tecolote  Risk  Model  (Air  Force  Risk  Model)  has  been  done 
with  the  "Riskmain.exe"  file  dated  18  February  91.  The 
random  number  seed  cannot  be  controlled  in  this  version  of 
the  Tecolote  Risk  Model.  Therefore,  the  random  number  seed 
is  both  unknown  and  non-repeatable.  The  input  parameters 
are the same as used for the validation process and are two
triangular cost distributions with a range of 0 to 1000 and
a mode of 250. The correlation is varied over three values.
Other trials with different distributions and correlations
were tested with the results being the same. Table 7 shows
the result of the Tecolote Risk Model verification.


Table 7 Results of Tecolote Risk Model verification
criterion 2b

              ANALYTICAL RESULT                TECOLOTE RISK MODEL
                                               (RISKMAIN.EXE DATED 18 FEB 91)
              TOTAL COST  TOTAL COST           TOTAL COST  TOTAL COST
CORRELATION   MEAN        VARIANCE     TRIAL   MEAN        VARIANCE
-0.5          833         45,139       1       850         88,578
                                       2       825         85,270
                                       3       836         91,041
                                       4       826         82,042
                                       5       849         88,500
0             833         90,278       1       812         87,243
                                       2       823         89,467
                                       3       832         86,013
                                       4       830         88,881
                                       5       833         79,738
0.5           833         135,417      1       836         88,923
                                       2       823         84,187
                                       3       828         84,048
                                       4       817         88,703
                                       5       816         84,856

Clearly, the Table 7 results show that the total cost mean
behaves as would be expected from equation 3. The
analytical mean calculated using equation 3 (see column 2)
is roughly equivalent to the mean from the Tecolote Risk
Model (see column 5). The Mann-Whitney test for equivalent
means was used to verify the equivalence of the simulation
mean and the analytical mean (5:224-229). The test showed
that at the 90% confidence level the sum of the means from
the analytical solution is equal to the simulation mean.
However, the simulated total cost variance (see column 6)
does not reflect what would be expected from equation 4
(see column 3). For example, at correlation = -0.5, the
variance should be 45,139. The Tecolote Risk Model
generates variances between 82,042 and 91,041. At ρ = 0.5,
the same problem is exhibited. The total cost variance
appears to be unaffected by the correlation coefficient
since the averages for correlations -0.5, 0, and 0.5 are
respectively 87,086, 86,268, and 86,143. The simulated
total cost variance tends toward the value for the case of
independence. Equation 4 states that the total cost
variance is equal to the sum of the cost element variances
plus two times the covariance between the lower level cost
elements. Since covariance is a function of the correlation
coefficient, the total cost variance should vary with
correlation. The Tecolote Risk Model (Air Force Risk Model;
"riskmain.exe" dated 18 February 1991) does not pass this
verification criterion.
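
The analytical values in Table 7 can be cross-checked with
a few lines of code. The sketch below assumes the standard
mean and variance formulas for the triangular distribution
and the relationships stated above for equations 3 and 4
(sum of the means; sum of the variances plus twice the
covariance). The function names are hypothetical, and the
trial variances are transcribed from Table 7.

```python
import math


def tri_mean(low, mode, high):
    """Mean of a triangular distribution."""
    return (low + mode + high) / 3.0


def tri_var(low, mode, high):
    """Variance of a triangular distribution."""
    return (low ** 2 + mode ** 2 + high ** 2
            - low * mode - low * high - mode * high) / 18.0


def total_cost_stats(d1, d2, rho):
    """Analytical total cost mean and variance for two cost elements."""
    mean = tri_mean(*d1) + tri_mean(*d2)
    v1, v2 = tri_var(*d1), tri_var(*d2)
    covariance = rho * math.sqrt(v1 * v2)
    return mean, v1 + v2 + 2.0 * covariance


dist = (0.0, 250.0, 1000.0)  # low, mode, high used for verification
analytical = {rho: total_cost_stats(dist, dist, rho)
              for rho in (-0.5, 0.0, 0.5)}

# Simulated total cost variances for the five trials in Table 7.
trial_variances = {
    -0.5: [88578, 85270, 91041, 82042, 88500],
    0.0: [87243, 89467, 86013, 88881, 79738],
    0.5: [88923, 84187, 84048, 88703, 84856],
}
average_variance = {rho: sum(v) / len(v)
                    for rho, v in trial_variances.items()}

for rho in (-0.5, 0.0, 0.5):
    mean, var = analytical[rho]
    print(f"rho = {rho:+.1f}: analytical mean = {mean:.0f}, "
          f"analytical variance = {var:.0f}, "
          f"average simulated variance = {average_variance[rho]:.0f}")
```

The analytical variance changes by a factor of three across
the three correlations, while the averaged simulated
variances stay close to the value for independence.
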

Internal Validity

The results for criterion 2b were tested for sensitivity
to random number seed and were found not to be sensitive
to it.

Sensitivity analyses for Chi-square test classification
interval size and random number seed were performed for
criterion 2c. The result is that the test is sensitive to
both parameters. The general trend is that the results are
valid  in  a  broad  perspective.  That  is,  if  the  user  limits 
the  correlation  coefficient  to  low  values  for  logically 
shaped  distributions,  the  total  cost  distribution  will 
probably  be  valid. 

V. Conclusion and Recommendations


Conclusions

Air Force Risk Model. The Air Force Risk Model
(Tecolote Risk Model, "riskmain.exe" file dated 18 February
1991) is not a valid implementation of the Tecolote Risk
Methodology. The total cost distribution variance does not
correspond to the analytically determined values. Tecolote
Research, Inc. was notified of this problem on 28 May 1991
and has located a software problem.

Tecolote  Risk  Methodology.  Once  the  user  has  defined 
logically  consistent  input  parameters  the  risk  methodology 
can  be  tested.  This  has  been  accomplished  in  this  research 
for  triangular  distributions.  Criteria  2a,  2b  and  2c  were 
used  to  evaluate  the  Tecolote  Cost  Risk  Methodology. 

Criterion 2a states that the user defined component
correlations should be maintained through the cost risk
model (i.e., input ρ = output ρ). This has been shown
mathematically by Book and Young to be true for the Tecolote
Risk Methodology (4:11). Therefore, the Tecolote Risk
Methodology satisfies criterion 2a.
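
Criterion 2a can also be illustrated numerically. The
following sketch is not the Comparison Model or the Tecolote
code. It assumes the 2 x 2 Cholesky factor is applied to
standardized triangular deviates, uses Python's own random
number generator, and checks that the sample correlation of
the resulting pairs is close to the input ρ. All names are
hypothetical.

```python
import math
import random


def tri_mean_sd(low, mode, high):
    """Mean and standard deviation of a triangular distribution."""
    mean = (low + mode + high) / 3.0
    var = (low ** 2 + mode ** 2 + high ** 2
           - low * mode - low * high - mode * high) / 18.0
    return mean, math.sqrt(var)


def simulate_pairs(d1, d2, rho, n, seed):
    """Draw n correlated cost pairs; d1 and d2 are (low, mode, high)."""
    rng = random.Random(seed)
    m1, s1 = tri_mean_sd(*d1)
    m2, s2 = tri_mean_sd(*d2)
    w = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        # random.triangular takes its arguments as (low, high, mode).
        x1 = rng.triangular(d1[0], d1[2], d1[1])
        x2 = rng.triangular(d2[0], d2[2], d2[1])
        z1 = (x1 - m1) / s1
        z2 = (x2 - m2) / s2
        # Second row of the assumed Cholesky factor: [rho, sqrt(1 - rho^2)].
        pairs.append((x1, m2 + s2 * (rho * z1 + w * z2)))
    return pairs


def sample_correlation(pairs):
    """Pearson sample correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


dist = (0.0, 250.0, 1000.0)
pairs = simulate_pairs(dist, dist, rho=0.5, n=20000, seed=1)
print("input rho = 0.5, sample rho =", round(sample_correlation(pairs), 3))
```

The sample correlation tracks the input value even though,
as criterion 2c shows, the shape of the second distribution
is not preserved.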

Criterion 2b states that the total cost mean and
variance calculated by the cost risk model should be equal
to the analytical total cost mean and variance. This has
been shown to be the case through the simulation.
Therefore, the Tecolote Risk Methodology satisfies criterion
2b.

Criterion 2c states that the input WBS cost element
probability density function shapes should be the same as
the output shapes. This has been tested with the Chi-square
goodness of fit test. Twenty-five cases of triangular
distribution pairs with the correlation coefficient varied
from -0.9 to 0.9 were tested. Several of these cases were
identified as consistent input parameters and therefore
serve to test the hypothesis. The Tecolote Risk Methodology
satisfies criterion 2c under limited conditions. There is a
narrow range of acceptable correlations allowed to be input
to the model. The cost analyst should not use a correlation
greater in magnitude than 0.4 (-0.4) for distributions that
are assumed to have positive (negative) logical correlation.

The  Tecolote  Risk  Methodology  is  valid  under  tight 
constraints.  The  Chi-square  goodness  of  fit  test  indicates 
that  the  model  is  distorting  the  user  defined  cost 
distributions. This author recommends the use of 18
classification intervals (17 degrees of freedom) for the
determination of where the cost distributions are not
distorted. There is a difference in the Chi-square test
statistic for 10, 18 and 34 classification intervals. Ten
and 34 classification intervals provide tighter constraints
for valid correlation input parameters given a distribution
shape.

The Tecolote Risk Model allows input of triangular,
beta, and uniform distributions; only triangular
distributions were explicitly tested. However, the results
from this data set may be extrapolated to the beta
distribution. Both the triangular and beta distributions
have finite limits, multiple skewness coefficients, and a
single mode (the beta distribution is more flexible in that
variance may also be varied). The analyst may assume that
the beta distribution will be distorted similarly to the
triangular distribution.

Recommendations

There  are  two  types  of  recommendations.  The  first 
recommendation  type  is  how  the  user  should  apply  the  Air 
Force  Risk  Model  and  the  second  is  recommendations  to  the 
Air  Force  Risk  Model  developer. 

The  cost  analyst  should  limit  the  use  of  the  Air  Force 
Risk  Model  (Tecolote  Risk  Model)  to  relatively  small 
correlations.  More  specifically,  the  user  should  limit  the 
correlations  to  the  values  shown  in  Table  6. 

The cost analyst should calculate the analytical total
cost mean and variance as a cross-check for the Tecolote
Risk Model.

The remainder of this section is addressed to the Air Force
Risk Model developer. The implementation should be modified
to include the seed and display the output. The user should
be able to input the random seed number for any risk
analysis "run". This allows repeatability of the simulation
and makes sensitivity analysis less difficult. Since all
simulation studies should be based on multiple replications,
the model should have the capability to calculate the
average of independent replications. This could be offered
as an option in the menu tree.

The  Air  Force  Risk  Model  should  display  both  the 
probability  distribution  function  and  the  cumulative 
distribution  function  at  every  level  in  the  WBS  structure. 
The  order  that  the  mean,  mode  and  standard  deviation  are 
displayed  on  the  graphs  in  the  model  should  be  consistent. 

Further Research. An investigation into alternative C =
LL' factorization algorithms should be made. The Cholesky
decomposition  produces  a  repeatable  correlation  matrix 
factor.  However,  the  L  factor  is  not  unique.  Other  factors 
exist,  and  these  may  not  distort  the  cost  distributions  as 
much  as  Cholesky  decomposition.  A  similar  test  to  the  one 
accomplished  in  this  research  may  be  replicated  for  other 
factorization  algorithms. 
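
The non-uniqueness of the factor can be seen in the 2 x 2
case. The sketch below builds two different factors F with
F F' = C for the same correlation matrix: the lower
triangular Cholesky factor and a symmetric square-root
factor. It is offered only as an illustration with
hypothetical function names. Whether any alternative factor
distorts the cost distributions less is the open question
raised above and is not answered here.

```python
import math


def cholesky_factor(rho):
    """Lower triangular factor of [[1, rho], [rho, 1]]."""
    return [[1.0, 0.0], [rho, math.sqrt(1.0 - rho * rho)]]


def symmetric_root_factor(rho):
    """Symmetric factor [[a, b], [b, a]] of [[1, rho], [rho, 1]]."""
    a = (math.sqrt(1.0 + rho) + math.sqrt(1.0 - rho)) / 2.0
    b = (math.sqrt(1.0 + rho) - math.sqrt(1.0 - rho)) / 2.0
    return [[a, b], [b, a]]


def times_transpose(f):
    """Return the matrix product F F' for a square matrix F."""
    n = len(f)
    return [[sum(f[i][k] * f[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


rho = 0.5
for name, factor in (("Cholesky", cholesky_factor(rho)),
                     ("symmetric root", symmetric_root_factor(rho))):
    print(name, "factor:", factor)
    print("  F F' =", times_transpose(factor))
```

Both factors reproduce the same correlation matrix, but the
Cholesky factor leaves the first deviate untouched and
alters only the second, while the symmetric factor mixes
both deviates.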

An investigation of the effect that the other three
uncertainty types (schedule, technology, and configuration)
allowed in the Air Force Risk Model have on cost is
recommended.

An investigation into the effect of the Cholesky
decomposition on the beta and uniform distributions is
recommended. The Air Force Risk Model accepts input of beta,
triangular and uniform cost distributions. The same criteria
used in this research could be used for the beta and uniform
distributions.

An investigation of the effect that Cholesky
decomposition has on the "ith" cost element is recommended.
Assuming cost dependencies, how is the "10th" WBS cost
element affected by the previous 9 cost elements?

Research should be done on how to identify the proper
correlations between WBS elements. That is, which data
should be used (T1s, total cost, down some learning curve,
etc.)?

Does  the  correlation  matrix  change  as  a  function  of 
program  maturity?  That  is,  one  might  expect  a  dense 
correlation  matrix  for  new  programs  and  a  sparse  correlation 
matrix  for  mature  programs. 

If  a  correlation  matrix  is  not  positive  definite,  there 
is  no  current  method  to  identify  the  pair(s)  of  WBS  elements 
that  are  not  consistent. 
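
A check for positive definiteness can be sketched by
attempting the Cholesky factorization and watching for a
non-positive pivot. The function name and the example
matrices are hypothetical. In line with the remark above,
the check reports only whether the matrix as a whole is
acceptable; it does not identify which pair of WBS elements
causes the inconsistency.

```python
import math


def is_positive_definite(c, tol=1e-12):
    """Return True if the symmetric matrix c admits a Cholesky factor."""
    n = len(c)
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = c[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if pivot <= tol:
            return False  # factorization breaks down at column j
        lower[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = (c[i][j] - partial) / lower[j][j]
    return True


# Elements 1-2 and 1-3 strongly positive, but 2-3 strongly negative.
inconsistent = [[1.0, 0.9, 0.9],
                [0.9, 1.0, -0.9],
                [0.9, -0.9, 1.0]]
consistent = [[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]]
print("inconsistent matrix accepted:", is_positive_definite(inconsistent))
print("consistent matrix accepted:", is_positive_definite(consistent))
```
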

Appendix A: Case 1 Data

[Panel title: DIST 1 MODE - 0, RANDOM SEED 1; DIST 2 MODE - 0]
Figure 13 Chi-square test for Case 1 with random seed 1 and 17

Figure 14 Boundary chart for Case 1 with random seed

[Panel title: RANDOM SEED 2]
Figure 15 Chi-square test for Case 1 with random seed 2 and 17

[Panel title: RANDOM SEED 3]
Chi-square test for Case 1 with random seed 3 and

Chi-square test for Case 1 with random seed 4 and 17

[Panel title: DIST 1 MODE - 0, RANDOM SEED 5; DIST 2 MODE - 0]
Figure 18 Chi-square test for Case 1 with random seed 5 and 17

Chi-square test for Case 2 with random seed 1 and 17

Figure 20 Boundary chart for Case 2 with random seed

Figure 21 Chi-square test for Case 3 with random seed 1 and 17

Figure 23 Chi-square test for Case 4 with random seed 1 and 17

Figure 24 Boundary chart for Case 4 with random seed

Figure 25 Chi-square test for Case 5 with random seed 1 and 17

Figure 28 Boundary chart for Case 6 with random seed

Chi-square test for Case 7 with random seed 1 and 17

Figure 30 Boundary chart for Case 7 with random seed

Figure 31 Chi-square test for Case 8 with random seed 1 and 17

[Panel title: LOW-0, MODE-250, HIGH-1000; LOW-0, MODE-500, HIGH-1000]
Figure 32 Boundary chart for Case 8 with random seed

Chi-square test for Case 9 with random seed 1 and 17

Figure 34 Boundary chart for Case 9 with random seed


Appendix  J:  Case  10  Data 
Figure  35  Chi-square  test  for  Case  10  with  random  seed  1  and  17
Figure  36  Boundary  chart  for  Case  10  with  random  seed


Appendix  K:  Case  11  Data 
Figure  37  Chi-square  test  for  Case  11  with  random  seed  1  and  17
Figure  38  Boundary  chart  for  Case  11  with  random  seed
Figure  40  Boundary  chart  for  Case  12  with  random  seed
Figure  41  Chi-square  test  for  Case  13  with  random  seed  1  and  17
Figure  42  Boundary  chart  for  Case  13  with  random  seed
Appendix  N:  Case  14  Data
Figure  43  Chi-square  test  for  Case  14  with  random  seed  1  and  17
Figure  44  Boundary  chart  for  Case  14  with  random  seed
Figure  45  Chi-square  test  for  Case  15  with  random  seed  1  and  17


1  LOW-0,  MODE-500,  HIGH-1000
2  LOW-0,  MODE-1000,  HIGH-1000


Figure  46  Boundary  chart  for  Case  15  with  random  seed 
Figure  47  Chi-square  test  for  Case  16  with  random  seed  1  and  17
Figure  48  Boundary  chart  for  Case  16  with  random  seed
Figure  49  Chi-square  test  for  Case  17  with  random  seed  1  and  17
Figure  50  Boundary  chart  for  Case  17  with  random  seed
Figure  51  Chi-square  test  for  Case  18  with  random  seed  1  and  17
Figure  52  Boundary  chart  for  Case  18  with  random  seed
Figure  53  Chi-square  test  for  Case  19  with  random  seed  1  and  17
Figure  54  Boundary  chart  for  Case  19  with  random  seed
Figure  56  Boundary  chart  for  Case  20  with  random  seed
Figure  60  Chi-square  test  for  Case  20  with  random  seed  5  and  17
Figure  61  Chi-square  test  for  Case  21  with  random  seed  1  and  17
Figure  62  Boundary  chart  for  Case  21  with  random  seed
Figure  63  Chi-square  test  for  Case  22  with  random  seed  1  and  17
Figure  66  Boundary  chart  for  Case  23  with  random  seed
Figure  70  Boundary  chart  for  Case  25  with  random  seed